Ecological dissimilarity matters more than geographical distance when predicting land surface indicators using machine learning

May 22, 2024
Supervised training techniques, such as those used in machine learning, generally use large sets of in situ data to train models that can, in turn, be used to make predictions (or prediction maps) about the Earth’s surface in times or places where no in situ data exist. The purpose of the present study is to investigate, using a very large set of in situ data from across the western United States (U.S.), the conditions under which training data from a geographic region other than the one where predictions are desired may be substituted. To do this, we train models using in situ data from level IV ecoregions and test how well these models predict surface conditions in different ecoregions. We characterize the difference between each possible pair of ecoregions in terms of geographical (centroid-to-centroid) distance and “ecological dissimilarity.” Ecological dissimilarity between pairs of ecoregions is defined in two ways: 1) as the Euclidean distance in the multivariate space defined by in situ indicators designed for monitoring purposes and 2) in terms of the difference in temporal behavior from model- and remote sensing-derived datasets. Although, overall, prediction error increases with geographical distance between training and testing ecoregions, our results indicate that ecological dissimilarity can be used to predict the error expected from a model trained with data from one ecoregion when applied in a different ecoregion.
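The two kinds of between-ecoregion distance described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of per-ecoregion mean indicator vectors, the per-indicator `scale` divisor, and the haversine formula for centroid-to-centroid distance are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length indicator rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def ecological_dissimilarity(plots_a, plots_b, scale):
    """Euclidean distance between the mean indicator vectors of two ecoregions.

    plots_a, plots_b: lists of plots, each plot a list of indicator values
    (hypothetical layout). scale: one positive divisor per indicator, e.g. a
    standard deviation, so indicators in different units are comparable
    (an assumption; the source does not say how indicators are scaled).
    """
    mean_a = mean_vector(plots_a)
    mean_b = mean_vector(plots_b)
    return math.sqrt(
        sum(((a - b) / s) ** 2 for a, b, s in zip(mean_a, mean_b, scale))
    )


def centroid_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle (haversine) distance between two centroids, in km.

    Assumes a spherical Earth with the given mean radius.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical example: two ecoregions, two indicators per plot.
region_a = [[0.0, 0.0], [2.0, 2.0]]
region_b = [[4.0, 5.0]]
dissimilarity = ecological_dissimilarity(region_a, region_b, scale=[1.0, 1.0])
distance_km = centroid_distance_km(36.0, -112.0, 40.0, -117.0)
```

With both quantities computed for every training and testing pair, one could then compare how strongly each relates to the prediction error observed when a model trained in one ecoregion is applied in the other.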
Publication Year: 2024
Title: Ecological dissimilarity matters more than geographical distance when predicting land surface indicators using machine learning
DOI: 10.1109/TGRS.2024.3404240
Authors: Bo Zhou, Gregory S. Okin, Junzhe Zhang, Shannon L. Savage, Christopher J. Cole, Michael C. Duniway
Publication Type: Article
Publication Subtype: Journal Article
Series Title: IEEE Transactions on Geoscience and Remote Sensing
Index ID: 70259161
Record Source: USGS Publications Warehouse
USGS Organization: Southwest Biological Science Center