Don’t Let Negatives Hold You Back: Accounting for Underlying Physics and Natural Distributions of Hydrothermal Systems When Selecting Negative Training Sites Leads to Better Machine Learning Predictions
December 29, 2023
Selecting negative training sites is an important challenge to resolve when using machine learning (ML) to predict hydrothermal resource favorability, because ideal models would discriminate between hydrothermal systems (positives) and all types of locations without hydrothermal systems (negatives). The Nevada Machine Learning project (NVML) fit an artificial neural network to identify areas favorable for hydrothermal systems, using 62 negative sites selected where the research team had confidence that no hydrothermal resource exists. Herein, we compare the implications of the expert selection of negatives (i.e., the NVML strategy) with a random sample strategy, where it is assumed that areas outside the favorable structural ellipses defined by NVML are negative. Because hydrothermal systems are sparse, it is highly probable that, in the absence of a favorable geological structure, hydrothermal favorability is low. We compare three training strategies: 1) the positive and negative labeled examples from NVML; 2) the positive examples from NVML with randomly selected negatives at the same frequency as in NVML; and 3) the positive examples from NVML with randomly selected negatives reflecting the expected natural distribution of hydrothermal systems relative to the total area. We apply these training strategies to the NVML feature data (input data) using two ML algorithms (XGBoost and logistic regression) to create six favorability maps for hydrothermal resources. When accounting for the expected natural distribution of hydrothermal systems, we find that XGBoost performs better than the NVML neural network and its negatives. Model validation was less reliable using F1 scores, a common performance metric, than when comparing probability estimates at known positives, likely because of the extreme natural class imbalance and the lack of negatively labeled sites. This work demonstrates that expert selection of negatives for training in NVML likely imparted modeling bias. Accounting for the sparsity of hydrothermal systems and all the types of locations without hydrothermal systems allows us to create better models for predicting hydrothermal resource favorability.
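The three training strategies compared in the abstract differ only in how the negative set is assembled. The following is a minimal sketch of that step, not the authors' code. The function name `build_negative_set`, the cell identifiers, the count of 80 positives, and the `prevalence` value of 0.01 are all hypothetical; the abstract states only that NVML used 62 expert-selected negatives and gives no number for the expected natural distribution.

```python
import random

def build_negative_set(strategy, expert_negatives, unlabeled_cells,
                       n_positives, prevalence=None, seed=0):
    """Return negative training cells for one of three strategies.

    strategy        -- "expert", "random_equal", or "random_natural"
    unlabeled_cells -- cells assumed negative (for example, those outside
                       the favorable structural ellipses)
    prevalence      -- ASSUMED fraction of cells that host a hydrothermal
                       system; a hypothetical input, not a published value
    """
    rng = random.Random(seed)
    if strategy == "expert":
        # Strategy 1: keep the expert-selected negatives as they are.
        return list(expert_negatives)
    if strategy == "random_equal":
        # Strategy 2: random negatives, same count as the expert set.
        return rng.sample(unlabeled_cells, len(expert_negatives))
    if strategy == "random_natural":
        # Strategy 3: choose the negative count so that
        # positives / (positives + negatives) equals the assumed prevalence.
        if prevalence is None or not 0 < prevalence < 1:
            raise ValueError("prevalence must be strictly between 0 and 1")
        n_neg = round(n_positives * (1 - prevalence) / prevalence)
        n_neg = min(n_neg, len(unlabeled_cells))
        return rng.sample(unlabeled_cells, n_neg)
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical inputs for illustration only.
cells = list(range(100_000))   # stand-in identifiers for map cells
expert = list(range(62))       # stand-in for the 62 NVML negatives
for name in ("expert", "random_equal", "random_natural"):
    negatives = build_negative_set(name, expert, cells,
                                   n_positives=80, prevalence=0.01)
    print(name, len(negatives))
```

Under the third strategy the negative set grows far larger than the positive set, which is the class imbalance the abstract cites as a likely reason F1 scores were less reliable than probability estimates at known positives.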
Citation Information
Publication Year | 2023 |
---|---|
Title | Don’t Let Negatives Hold You Back: Accounting for Underlying Physics and Natural Distributions of Hydrothermal Systems When Selecting Negative Training Sites Leads to Better Machine Learning Predictions |
Authors | Pascal D. Caraccioli, Stanley Paul Mordensky, Cary Ruth Lindsey, Jacob DeAngelo, Erick R. Burns, John Lipor |
Publication Type | Article |
Publication Subtype | Journal Article |
Series Title | Geothermal Resources Council Transactions |
Index ID | 70251035 |
Record Source | USGS Publications Warehouse |
USGS Organization | Geology, Minerals, Energy, and Geophysics Science Center |
Related
Geothermal Resource Investigations Project
Geothermal energy is a significant source of renewable electric power in the western United States and, with advances in exploration and development technologies, a potential source of a large fraction of baseload electric power for the entire country. This project focuses on advancing geothermal research through a better understanding of geothermal resources and the impacts of geothermal...
Erick R Burns
Research Hydrologist