Discovering hidden geothermal signatures using non-negative matrix factorization with customized k-means clustering

October 11, 2022

Discovery of hidden geothermal resources is challenging. It requires the mining of large datasets with diverse data attributes representing subsurface hydrogeological and geothermal conditions. The commonly used play fairway analysis approach typically incorporates subject-matter expertise to analyze regional data to estimate geothermal characteristics and favorability. We demonstrate an alternative approach based on machine learning (ML) to process a geothermal dataset from southwest New Mexico (SWNM). The study region includes low- and medium-temperature hydrothermal systems. Several of these systems are not well characterized because of insufficient existing data and limited past exploratory work. This study discovers hidden patterns and relations in the SWNM geothermal dataset to improve our understanding of the regional hydrothermal conditions and energy-production favorability. This understanding is obtained by applying an unsupervised ML algorithm based on non-negative matrix factorization coupled with customized k-means clustering (NMFk). NMFk can automatically identify (1) hidden signatures characterizing analyzed datasets, (2) the optimal number of these signatures, (3) the dominant data attributes associated with each signature, and (4) the spatial distribution of the extracted signatures. Here, NMFk is applied to analyze 18 geological, geophysical, hydrogeological, and geothermal attributes at 44 locations in SWNM. Using NMFk, we find data patterns and identify the spatial associations of hydrothermal signatures within two physiographic provinces (Colorado Plateau and Basin and Range) and two sub-regions of these provinces (the Mogollon-Datil volcanic field and the Rio Grande rift) in SWNM. The ML algorithm extracted five hydrothermal signatures in the SWNM dataset that differentiate between low-temperature (<90 °C) and medium-temperature (90-150 °C) hydrothermal systems. 
The algorithm also suggests that the Rio Grande rift and northern Mogollon-Datil volcanic field are the most favorable regions for future geothermal resource discovery. NMFk also identified the attributes that are critical for recognizing medium-temperature hydrothermal systems in the study area. The resulting NMFk model can be applied to predict geothermal conditions and their uncertainties at new SWNM locations based on limited data from unexplored regions. The code to execute the performed analyses as well as the corresponding data can be found at
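
To make the idea of coupling non-negative matrix factorization with constrained clustering more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' NMFk code. It assumes a data matrix X of attributes by locations (18 by 44, matching the sizes in the abstract) filled with synthetic random values, factorizes it as X ≈ W H for several candidate numbers of signatures k, repeats each factorization from random starts, matches the columns of W across runs so that every run contributes exactly one column to each cluster, and then reports a reconstruction error and a silhouette score per k. All function names (nmf, align_columns, silhouette, nmfk_sketch, pick_k), the multiplicative-update solver, the cosine distance, the run counts, and the 0.5 silhouette threshold are illustrative assumptions.

```python
import itertools
import numpy as np


def nmf(X, k, rng, n_iter=300, eps=1e-9):
    """Factorize X ~ W @ H using Lee-Seung multiplicative updates (Frobenius norm)."""
    n, m = X.shape
    W = rng.random((n, k)) + eps  # attributes x signatures
    H = rng.random((k, m)) + eps  # signatures x locations
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def align_columns(W_ref, W):
    """Reorder the columns of W to best match W_ref by cosine similarity.

    Each run supplies exactly one column per signature, which stands in for
    the equal-cluster-size constraint (an assumption of this sketch).
    Brute force over permutations, so only suitable for small k.
    """
    k = W.shape[1]
    unit = lambda A: A / (np.linalg.norm(A, axis=0, keepdims=True) + 1e-12)
    sim = unit(W_ref).T @ unit(W)  # sim[i, j] = cos(W_ref[:, i], W[:, j])
    best = max(itertools.permutations(range(k)),
               key=lambda p: sum(sim[i, p[i]] for i in range(k)))
    return W[:, list(best)]


def silhouette(clusters):
    """Mean silhouette for unit vectors shaped (k, n_runs, n_features), cosine distance."""
    k, r, _ = clusters.shape
    scores = []
    for c in range(k):
        for i in range(r):
            d = np.clip(1.0 - clusters @ clusters[c, i], 0.0, None)  # shape (k, r)
            a = (d[c].sum() - d[c, i]) / max(r - 1, 1)        # own-cluster mean
            b = min(d[o].mean() for o in range(k) if o != c)  # nearest other cluster
            scores.append((b - a) / max(a, b, 1e-12))
    return float(np.clip(np.mean(scores), -1.0, 1.0))


def nmfk_sketch(X, k_values, n_runs=5, seed=0):
    """Return {k: {"error": ..., "silhouette": ...}} for each candidate k >= 2."""
    rng = np.random.default_rng(seed)
    results = {}
    for k in k_values:
        runs = [nmf(X, k, rng) for _ in range(n_runs)]
        errors = [np.linalg.norm(X - W @ H) / np.linalg.norm(X) for W, H in runs]
        W_ref = runs[int(np.argmin(errors))][0]  # best run is the reference
        aligned = np.stack([align_columns(W_ref, W) for W, _ in runs])  # (runs, n, k)
        aligned = aligned / (np.linalg.norm(aligned, axis=1, keepdims=True) + 1e-12)
        clusters = aligned.transpose(2, 0, 1)  # (k, runs, n)
        results[k] = {"error": float(min(errors)),
                      "silhouette": silhouette(clusters)}
    return results


def pick_k(results, min_silhouette=0.5):
    """Largest k whose signatures are reproducible across runs (threshold is assumed)."""
    stable = [k for k, r in results.items() if r["silhouette"] >= min_silhouette]
    return max(stable) if stable else None


# Synthetic stand-in data: 18 attributes at 44 locations built from 3 signatures.
data_rng = np.random.default_rng(42)
X_demo = data_rng.random((18, 3)) @ data_rng.random((3, 44))
results = nmfk_sketch(X_demo, k_values=[2, 3, 4])
for k, r in results.items():
    print(k, round(r["error"], 4), round(r["silhouette"], 3))
print("selected k:", pick_k(results))
```

In this sketch the columns of W indicate which attributes dominate each signature, and the rows of H indicate how strongly each signature is expressed at each location. A high silhouette score means the same signatures reappear across random restarts, which is the reproducibility criterion used here to judge a candidate k.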

Publication Year 2022
Title Discovering hidden geothermal signatures using non-negative matrix factorization with customized k-means clustering
DOI 10.1016/j.geothermics.2022.102576
Authors Velimir V. Vesselinov, Bulbul Ahmmed, Maruti K. Mudunuru, Jeff D. Pepin, Erick R. Burns, Drew L. Siler, Satish Karra, Richard S. Middleton
Publication Type Article
Publication Subtype Journal Article
Series Title Geothermics
Index ID 70237376
Record Source USGS Publications Warehouse
USGS Organization New Mexico Water Science Center