Skip to main content
U.S. flag

An official website of the United States government

Predicting redox-sensitive contaminant concentrations in groundwater using random forest classification

August 29, 2017

Machine learning techniques were applied to a large (n > 10,000) compliance monitoring database to predict the occurrence of several redox-active constituents in groundwater across a large watershed. Specifically, random forest classification was used to determine the probabilities of detecting elevated concentrations of nitrate, iron, and arsenic in the Fox, Wolf, Peshtigo, and surrounding watersheds in northeastern Wisconsin. Random forest classification is well suited to describe the nonlinear relationships observed among several explanatory variables and the predicted probabilities of elevated concentrations of nitrate, iron, and arsenic. Maps of the probability of elevated nitrate, iron, and arsenic can be used to assess groundwater vulnerability and the vulnerability of streams to contaminants derived from groundwater. Processes responsible for elevated concentrations are elucidated using partial dependence plots. For example, an increase in the probability of elevated iron and arsenic occurred when well depths coincided with the glacial/bedrock interface, suggesting a bedrock source for these constituents. Furthermore, groundwater in contact with Ordovician bedrock has a higher likelihood of elevated iron concentrations, which supports the hypothesis that groundwater liberates iron from a sulfide-bearing secondary cement horizon of Ordovician age. Application of machine learning techniques to existing compliance monitoring data offers an opportunity to broadly assess aquifer and stream vulnerability at regional and national scales and to better understand geochemical processes responsible for observed conditions.

Related Content