Maximizing species distribution model performance when using historical occurrences and variables of varying persistency

March 9, 2022

Occurrence data used to build species distribution models often include historical records from locations in which the species no longer exists. When these records are paired with contemporary environmental values that no longer represent the conditions the species experienced, the model creates false associations that hurt predictive performance. The extent of mismatching increases with the number of historical occurrences and with inclusion of environmental variables that are prone to change over time. Indeed, the mismatch between occurrence data and contemporaneous environmental variables is a common dilemma when modeling rare or cryptic species, especially those of conservation concern that were once more abundant. Herein, we assess (1) the impact of historical occurrences on model performance across three sets of environmental variables of increasing persistency and (2) the performance of models built using selected-historical occurrences from locations that showed evidence of limited environmental change over time. Concepts are tested on federally listed flatwoods salamanders, reflecting real-world conservation management efforts. We predicted that, compared to other occurrence sets, (1) historical occurrences would perform best with environmental variables that were more persistent, (2) recent occurrences would perform best when the environmental variables were more impersistent, and that (3) our selected-historical occurrences would perform best with a combination of persistent and impersistent variables. Our results showed the expected inversion of model performance of recent and historical occurrences across environmental variables of increasing persistency when evaluated by correct predictions. However, the inversion was not seen in area under the curve performance, in which historical occurrences outperformed recent occurrence models across all variable sets. 
Selected-historical occurrences did not notably improve performance over all-historical occurrences in any metric or variable set. To maximize utility and performance, modelers could acknowledge potential trade-offs from inclusion of historical occurrences and consider number and age of recent and historical occurrences available, the persistency of environmental variables considered, and how their conservation goals are reflected in model design and evaluation, particularly with respect to sensitivity versus specificity. Our study lends support for inclusion of historical occurrences, with the potential exception of mostly impersistent variables when sensitivity is the highest priority.
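The trade-off between sensitivity and specificity, and the way a threshold-free metric such as area under the curve (AUC) can disagree with threshold-based counts of correct predictions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the 0.5 threshold, and the toy presence/absence labels and suitability scores are hypothetical and are not taken from the study, whose evaluation procedure is not described in this record.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given suitability threshold.

    y_true holds 1 for a presence site and 0 for an absence site;
    scores holds the model's predicted suitability for each site.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_presence = s >= threshold
        if y == 1:
            if predicted_presence:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_presence:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence site")
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random presence site outscores a
    random absence site (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence site")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical evaluation data (not from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
suitability = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.1, 0.35]

sens, spec = sensitivity_specificity(labels, suitability, threshold=0.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"AUC={auc(labels, suitability):.4f}")
```

In this toy case the model ranks presence sites above absence sites fairly well (a relatively high AUC) while still missing half of the presences at the chosen threshold. A model can therefore look strong on AUC yet weak on sensitivity, which is why the choice of metric should follow the conservation goal.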

Citation Information

Publication Year: 2022
Title: Maximizing species distribution model performance when using historical occurrences and variables of varying persistency
DOI: 10.1002/ecs2.3951
Authors: Jason T. Bracken, Amelie Y. Davis, Katherine O'Donnell, William Barichivich, Susan C. Walls, Tereza Jezkova
Publication Type: Article
Publication Subtype: Journal Article
Series Title: Ecosphere
Index ID: 70230166
Record Source: USGS Publications Warehouse
USGS Organization: Wetland and Aquatic Research Center