Data for Machine Learning Predictions of Nitrate in Groundwater Used for Drinking Supply in the Conterminous United States

November 1, 2021

A three-dimensional extreme gradient boosting (XGB) machine learning model was developed to predict the distribution of nitrate in groundwater across the conterminous United States (CONUS). Nitrate was predicted at a 1-square-kilometer (km²) resolution for two drinking water zones, each of variable depth: one for domestic supply and one for public supply. The model used measured nitrate concentrations from 12,082 wells and included predictor variables representing well characteristics, hydrologic conditions, soil type, geology, land use, climate, and nitrogen inputs. Predictor variables derived from empirical or numerical process-based models were also included to integrate information on controlling processes and conditions. This data release documents the model and provides the model results. The model and results are discussed in the associated journal article, Ransom and others (2021). This data release includes: 1) a model archive of the R project, containing source code, input files (model training and hold-out data, rasters of all final predictor variables, and rasters representing the domestic and public supply depth zones), and output files (two rasters of predicted nitrate concentration at the depth zones typical of domestic and public supply wells); 2) a read_me file describing the model archive and explaining its use; and 3) tables describing model variables, model fit statistics, and model results. These tables are also included in the Supporting Information published with the journal article, Ransom and others (2021).
