Data to support Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies
October 4, 2023
This data release contains one dataset and one model archive in support of the journal article "Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies" by Jennifer C. Murphy and Jeffrey G. Chanat. The model archive contains R scripts that reproduce the four machine learning models (logistic regression, linear discriminant analysis, quadratic discriminant analysis, and k-nearest neighbors) trained and tested for the journal article. The dataset contains the probabilities estimated by each of these models when applied to a training dataset and a test dataset.
Citation Information
Publication Year | 2023 |
---|---|
Title | Data to support Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies |
DOI | 10.5066/P9GNEN8S |
Authors | Jennifer C Murphy Blair, Jeffrey G Chanat |
Product Type | Data Release |
Record Source | USGS Asset Identifier Service (AIS) |
USGS Organization | Central Midwest Water Science Center |
Rights | This work is marked with CC0 1.0 Universal |
Related
Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies
Large multi-site trend studies provide an opportunity to evaluate progress of waterbodies towards water-quality goals across broad geographic areas. Such studies often aggregate the results of site-specific models and thus contend with evaluating each model for appropriate fit and statistical assumptions. We explored the use of four traditional machine learning models (logistic regression, linear
Authors
Jennifer C. Murphy, Jeffrey G. Chanat