Random forest regression model and prediction rasters of fluoride in groundwater in basin-fill aquifers of western United States

November 2, 2021

A random forest regression (RFR) model was developed to predict groundwater fluoride concentrations in four principal aquifers of the western United States: the California Coastal basin-fill aquifers, the Central Valley aquifer system, the Basin and Range basin-fill aquifers, and the Rio Grande aquifer system. The selected basin-fill aquifers are a vital resource for drinking-water supplies. The RFR model was developed with a dataset of over 12,000 wells sampled for fluoride between 2000 and 2018. This data release provides rasters of predicted fluoride concentrations at depths typical of domestic and public supply wells in the selected basin-fill aquifers. It also includes the final RFR model, which documents the prediction modeling process and can be used to verify and reproduce the model fit metrics and mapped predictions in the accompanying publication. Included in this data release are 1) a model archive of the R project, including source code, input files (model training and testing data and rasters of predictor variables), and output files (separate rasters of predicted fluoride at depths typical of domestic wells and of public supply wells); 2) a read_me file describing the model archive and explaining its use; 3) a Supporting_GIS_information.csv file describing model variables and source data; and 4) this metadata record.