This data release provides the data and R scripts used for the publication, submitted in 2017, titled "Improving predictions of hydrological low-flow indices in ungaged basins using machine learning". The release includes two .csv files and 14 R scripts. The "lowflow_sc_ga_al_gagesII_2015.csv" data file contains the annual minimum seven-day mean streamflow with an annual exceedance probability of 90% (7Q10) for 224 basins in South Carolina, Georgia, and Alabama, along with 231 basin characteristics from the GAGES-II dataset (https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011). The "all_preds.csv" file contains the leave-one-out cross-validated predictions for all the models. The associated paper compares the ability of eight machine-learning models (elastic net, gradient boosting, kernel k-nearest neighbors, two variants of support vector machines, M5-cubist, random forest, and a meta-learning ensemble M5-cubist model) and four baseline models (ordinary kriging, a unit-area discharge model, and two variants of censored regression) to generate estimates of the 7Q10 at 224 unregulated sites in South Carolina, Georgia, and Alabama.
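
As an illustration of what leave-one-out cross-validated predictions are, the following minimal Python sketch holds out each site in turn and predicts its 7Q10 from the remaining sites. It is not taken from the release's R scripts. The predictor used here (site drainage area times the mean 7Q10-per-unit-area of the other sites) is only an assumed stand-in for a unit-area discharge baseline, and the function name, variable names, and toy values are hypothetical.

```python
from statistics import mean


def loo_unit_area_predictions(areas, flows):
    """Return one leave-one-out prediction per site.

    Assumed stand-in model (not the paper's): predicted flow equals the
    held-out site's drainage area times the mean flow-per-unit-area of
    all other sites.
    """
    if len(areas) != len(flows) or len(areas) < 2:
        raise ValueError("need two or more sites with matching areas and flows")
    preds = []
    for i in range(len(areas)):
        # Fit on every site except site i, so site i never informs
        # its own prediction.
        ratios = [q / a for j, (a, q) in enumerate(zip(areas, flows)) if j != i]
        preds.append(areas[i] * mean(ratios))
    return preds


# Made-up toy values for illustration only; not from the data release.
toy_areas = [10.0, 20.0, 40.0, 80.0]  # drainage areas (arbitrary units)
toy_flows = [1.0, 2.0, 4.0, 8.0]      # 7Q10 values (arbitrary units)
toy_preds = loo_unit_area_predictions(toy_areas, toy_flows)
```

Each prediction is made by a model that never saw the site it is predicting, which is why leave-one-out predictions can stand in for performance at ungaged basins.
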
|Title||7Q10 records and basin characteristics for 224 basins in South Carolina, Georgia, and Alabama (2015)|
|Authors||Worland, Scott C.; Farmer, William H.; Kiang, Julie E.|
|Product Type||Data Release|
|Record Source||USGS Digital Object Identifier Catalog|
|USGS Organization||Lower Mississippi-Gulf Water Science Center|