Obtaining and applying public data for training students in technical statistical writing: Case studies with data from U.S. Geological Survey and general ecological literature
Effective undergraduate statistical education requires training with real-world data. Textbook datasets seldom match the complexity and messiness of real-world data, and finding suitable real-world datasets can be challenging for educators. Consulting and industrial datasets are often covered by nondisclosure agreements. Academic datasets often require subject-area expertise beyond that of a general education or lack connections to real-world applications. Many governments, including that of the United States, now require the release of data from projects they directly complete or fund through grants and contracts. We show how statistical educators may find such datasets and incorporate them into courses. Specifically, we use two examples from the U.S. Geological Survey (USGS) and one example from the ecology literature. We demonstrate the use of these datasets in an upper-level analysis of variance (ANOVA) class. In addition to describing how we found the datasets, we describe how to incorporate them into coursework and the course’s student assessments. We have used these datasets over multiple semesters and include student feedback from these courses. Although our examples focus on an ANOVA class, the general methods for finding data shared here could be used for statistics classes ranging from high school to graduate education. Supplementary materials for this article are available online.
|Barb Bennie, Richard A. Erickson
|Journal of Statistics and Data Science Education
|USGS Publications Warehouse
|Upper Midwest Environmental Sciences Center