Mapped predictions of manganese and arsenic in an alluvial aquifer using boosted regression trees

December 24, 2021

Manganese (Mn) concentrations and the probability of arsenic (As) exceeding the drinking-water standard of 10 μg/L were predicted in the Mississippi River Valley alluvial aquifer (MRVA) using boosted regression trees (BRT). The BRT models, a type of ensemble-tree machine-learning model, were built from predictor variables that affect the distribution of Mn and As in groundwater. These variables included iron (Fe) concentrations and specific conductance predicted from previously developed BRT models, groundwater flux and age estimates from MODFLOW, and hydrologic characteristics. The models also included results from the first airborne geophysical survey conducted in the United States to target an entire aquifer system. High Mn and As were predicted where Fe was high. Predicted high Mn concentrations were correlated with the fraction of young groundwater (less than 65 years old) computed from MODFLOW results. High probabilities of As exceedance were predicted where groundwater was relatively old and airborne electromagnetic resistivity was high, typically proximal to streams. Two-variable partial-dependence plots and sensitivity analysis were used to provide insight into the factors controlling the distribution of Mn and As in groundwater. The maps of predicted Mn concentrations and As exceedance probabilities can be used to identify areas where these constituents may be high and that could be targeted for further study. This paper shows that incorporating a selected set of process-informed data, such as MODFLOW results and airborne geophysics, into a machine-learning model improves model interpretability. Incorporating process-rich information into machine-learning models will likely be useful for addressing a wide range of problems of interest to groundwater hydrologists.
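To illustrate the general idea of a boosted-tree exceedance model and a partial-dependence calculation, here is a minimal sketch in plain Python. It is not the study's model or software. The data are synthetic, and the two predictors (an iron index and a groundwater-age index, both scaled to 0 to 1) and the rule that generates the labels are assumptions made up for the example. Each tree is a one-split stump, and leaf values are simple mean residuals, which is a simplification of what full gradient-boosting implementations do.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def candidate_thresholds(column, k=12):
    # Midpoints between distinct sorted values, thinned to roughly k candidates.
    vals = sorted(set(column))
    mids = [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    return mids[::max(1, len(mids) // k)]

def fit_stump(X, residuals):
    # One-split tree: choose the (feature, threshold) with the lowest squared error.
    best = None
    for j in range(len(X[0])):
        for t in candidate_thresholds([row[j] for row in X]):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - mean_l) ** 2 for r in left)
                   + sum((r - mean_r) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, mean_l, mean_r)
    return best[1:]

def fit_brt(X, y, n_trees=60, learning_rate=0.3):
    # Gradient boosting on the log-odds of exceedance (logistic loss).
    base_rate = sum(y) / len(y)
    f0 = math.log(base_rate / (1.0 - base_rate))
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the logistic loss: observed minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, mean_l, mean_r = fit_stump(X, residuals)
        stumps.append((j, t, mean_l, mean_r))
        scores = [s + learning_rate * (mean_l if row[j] <= t else mean_r)
                  for row, s in zip(X, scores)]
    return f0, learning_rate, stumps

def predict_proba(model, row):
    f0, lr, stumps = model
    return sigmoid(f0 + sum(lr * (ml if row[j] <= t else mr)
                            for j, t, ml, mr in stumps))

def partial_dependence(model, X, fixed):
    # Average prediction with the features in `fixed` ({index: value}) held
    # constant; fixing two features gives one cell of a two-variable plot.
    total = 0.0
    for row in X:
        modified = list(row)
        for j, value in fixed.items():
            modified[j] = value
        total += predict_proba(model, modified)
    return total / len(X)

# Synthetic data (assumption): exceedance is more likely when both indices are high.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if fe + age + rng.gauss(0, 0.1) > 1.0 else 0 for fe, age in X]

model = fit_brt(X, y)
p_high = predict_proba(model, [0.9, 0.9])
p_low = predict_proba(model, [0.1, 0.1])
print("P(exceed) at high Fe, old water:", round(p_high, 3))
print("P(exceed) at low Fe, young water:", round(p_low, 3))
print("Partial dependence, Fe index 0.1 vs 0.9:",
      round(partial_dependence(model, X, {0: 0.1}), 3),
      round(partial_dependence(model, X, {0: 0.9}), 3))
```

Evaluating `partial_dependence` over a grid of values for two features at once would give the surface shown in a two-variable partial-dependence plot. A production model would more likely use an established gradient-boosting library with deeper trees and tuned hyperparameters.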