Machine learning models of arsenic in private wells throughout the conterminous United States as a tool for exposure assessment in human health studies

March 17, 2021

Arsenic from geologic sources is widespread in groundwater within the United States (U.S.). In several areas, groundwater arsenic concentrations exceed the U.S. Environmental Protection Agency maximum contaminant level of 10 μg per liter (μg/L). However, this standard applies only to public-supply drinking water and not to private supplies, which are not federally regulated and are rarely monitored. As a result, arsenic exposure from private wells is a potentially substantial, but largely hidden, public health concern. Machine learning models using boosted regression trees (BRT) and random forest classification (RFC) techniques were developed to estimate probabilities and concentration ranges of arsenic in private wells throughout the conterminous U.S. Three BRT models were fit separately to estimate the probability of private well arsenic concentrations exceeding 1, 5, or 10 μg/L, whereas the RFC model estimates the most probable concentration category (≤5, >5 to ≤10, or >10 μg/L). Overall, the models perform best at identifying areas with low concentrations of arsenic in private wells. On testing data, the BRT 10 μg/L model has an overall accuracy of 91.2%, sensitivity of 33.9%, and specificity of 98.2%. Influential variables identified across all models included average annual precipitation and soil geochemistry. The models were developed in collaboration with public health experts to support U.S.-based studies focused on health effects from arsenic exposure.
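
The accuracy, sensitivity, and specificity reported above are standard summaries of a binary confusion matrix (here, whether a well exceeds 10 μg/L). The sketch below shows how the three quantities relate and why a model can be highly accurate overall while detecting only a minority of exceedances, which can happen when exceedances are rare. The function name and the well counts are hypothetical and chosen only for illustration; they are not the study's testing data.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: wells above the threshold that the model flagged as above
    fn: wells above the threshold that the model missed
    fp: wells at or below the threshold that the model flagged as above
    tn: wells at or below the threshold that the model classified as such
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all wells classified correctly
        "sensitivity": tp / (tp + fn),      # share of true exceedances detected
        "specificity": tn / (tn + fp),      # share of non-exceedances correctly cleared
    }


# Hypothetical counts (NOT from the study): 1,000 wells, 100 of which exceed 10 ug/L.
metrics = classification_metrics(tp=34, fn=66, fp=16, tn=884)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because non-exceedances dominate the hypothetical sample, the large number of correctly cleared wells drives overall accuracy, even though most true exceedances are missed. This is consistent with the statement that the models perform best at identifying areas with low arsenic concentrations.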

Publication Year	2021
Title	Machine learning models of arsenic in private wells throughout the conterminous United States as a tool for exposure assessment in human health studies
DOI	10.1021/acs.est.0c05239
Authors	Melissa Lombard, Molly Scannell Bryan, Daniel Jones, Catherine Bulka, Paul M. Bradley, Lorraine C. Backer, Michael J. Focazio, Debra T. Silverman, Patricia Toccalino, Maria Argos, Matthew O. Gribble, Joseph D. Ayotte
Publication Type	Article
Publication Subtype	Journal Article
Series Title	Environmental Science and Technology
Index ID	70219045
Record Source	USGS Publications Warehouse
USGS Organization	New England Water Science Center; Toxic Substances Hydrology Program; WMA - Office of Planning and Programming
