Data quality control for everyone: a course and recipes for well-documented data workflows in R
Developing training materials that cover data quality control using the R programming language
High-quality data are essential to USGS activities and products, including scientific research, data releases, and communication. Managing data quality throughout the data lifecycle supports this goal. USGS personnel use a range of approaches to process data and document data issues, from spreadsheet-based methods to custom code. We aim to support these approaches by developing training materials that cover data quality control using the R programming language. We will demonstrate the value of using scripts for data processing and provide example code showing how to perform a variety of checks. These examples will enable research scientists to implement improved workflows for data quality control. Public training materials will also help data scientists and reviewers develop and communicate data quality control processes. We will share these materials via a website published as a software release, which will make them accessible to the CDI and USGS community.