Beware of spatial autocorrelation when applying machine learning algorithms to borehole geophysical logs

January 31, 2021

Although many of the algorithms now considered to be machine learning algorithms (MLAs) have existed for more than half a century (e.g., Rosenblatt 1958), interest in applying MLAs to data-driven problems has grown rapidly across a variety of fields. This growth reflects the expanded availability of large, complex datasets that may be difficult to interrogate using other methods, increases in computing power, and a growing library of easily implemented machine learning tools. While MLAs are often similar to statistical methods, the two differ in their approach to problem solving. Statistical methods are more concerned with generating informative models from “long” data (i.e., many more observations than explanatory variables), whereas MLAs are typically concerned with generating accurate predictions from “wide” data (i.e., a large number of variables with relatively few observations; Bzdok et al. 2018). In hydrogeologic studies, such wide datasets may be available from boreholes, where various types of geophysical, geochemical, and lithological information may exist. Borehole datasets are therefore a tempting target for MLAs, both to reveal hidden relations among gathered data and parameters of interest (e.g., contaminant concentration) and as a means of parameter reduction (e.g., reducing costs by collecting fewer datasets).

Citation Information

Publication Year: 2021
Title: Beware of spatial autocorrelation when applying machine learning algorithms to borehole geophysical logs
DOI: 10.1111/gwat.13081
Authors: Neil Terry, Carole D. Johnson, Frederick Day-Lewis, Beth L. Parker, Lee D. Slater
Publication Type: Article
Publication Subtype: Journal Article
Series Title: Groundwater
Index ID: 70221208
Record Source: USGS Publications Warehouse
USGS Organization: New York Water Science Center; WMA - Earth System Processes Division
