Species distribution models (SDMs) are valuable for rare species conservation and are commonly used to extrapolate predictions of habitat suitability geographically to regions where species occurrence is unknown (i.e., transferability). Spatially structured cross-validation can be used to infer transferability, yet few studies have evaluated how delineation of cross-validation folds affects model complexity and predictions. We developed SDMs using multiple cross-validation approaches to understand the implications for predicting habitat suitability for northern Idaho ground squirrels, a rare, federally threatened species that has been extensively surveyed in regions where known populations occur, resulting in >8000 presence locations.
We delineated cross-validation folds by mimicking the manner in which predictions would be geographically extrapolated or by using existing dispersal barriers. We varied the number and directionality of folds and the distance between them. We conducted a grid search on statistical regularization parameters to optimize model complexity, covering a wider range of values than is typically implemented. For each cross-validation approach, we selected optimal regularization and model complexity based on out-of-sample predictive ability.
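The fold delineation and grid search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the band-based folds, the buffer between folds, and the names `assign_band_folds`, `split_with_buffer`, `grid_search`, and `fit_and_score` are all hypothetical, and the scoring callback stands in for whatever SDM and out-of-sample metric are actually used.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (easting, northing) in hypothetical projected units
Point = Tuple[float, float]


def assign_band_folds(points: Sequence[Point], n_folds: int, axis: int = 0) -> List[int]:
    """Assign each point to one of n_folds contiguous bands along one axis.

    axis=0 splits west-to-east and axis=1 splits south-to-north, so each
    held-out fold lies in a fixed direction from the training data.
    """
    values = [p[axis] for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_folds or 1.0  # avoid zero width if all values are equal
    return [min(int((v - lo) / width), n_folds - 1) for v in values]


def split_with_buffer(
    points: Sequence[Point], folds: Sequence[int], held_out: int,
    buffer: float = 0.0, axis: int = 0,
) -> Tuple[List[int], List[int]]:
    """Return (train, test) indices, dropping training points that fall
    within `buffer` of the held-out band along the chosen axis."""
    test = [i for i, f in enumerate(folds) if f == held_out]
    test_vals = [points[i][axis] for i in test]
    lo, hi = min(test_vals) - buffer, max(test_vals) + buffer
    train = [
        i for i, f in enumerate(folds)
        if f != held_out and not (lo <= points[i][axis] <= hi)
    ]
    return train, test


def grid_search(
    points: Sequence[Point], folds: Sequence[int], reg_values: Sequence[float],
    fit_and_score: Callable[[List[int], List[int], float], float],
    buffer: float = 0.0, axis: int = 0,
) -> Tuple[float, Dict[float, float]]:
    """Pick the regularization value with the highest mean held-out score."""
    results: Dict[float, float] = {}
    for reg in reg_values:
        scores = []
        for k in sorted(set(folds)):
            train, test = split_with_buffer(points, folds, k, buffer, axis)
            if train and test:  # skip folds left without training data
                scores.append(fit_and_score(train, test, reg))
        if scores:
            results[reg] = sum(scores) / len(scores)
    best = max(results, key=results.get)
    return best, results
```

In use, `fit_and_score` would fit a model on the training indices at the given regularization value and return a held-out metric where larger is better. Changing `n_folds`, `axis`, and `buffer` corresponds to varying the number, directionality, and separation of folds.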
Delineation of cross-validation folds substantially affected resulting model complexity and extrapolated predictions. All cross-validation approaches resulted in models with apparently high out-of-sample predictive ability, yet optimal model complexity varied substantially among the approaches. Varying regularization revealed a noisy relationship between model complexity and predictive performance, with local optima in predictive performance common at small regularization values.
Subtle modelling decisions can have large consequences for predictions of habitat suitability and transferability of SDMs. When transferability is the goal, cross-validation approaches should be considered carefully and should mimic the manner in which spatial extrapolation will occur; otherwise, overly complex models with inflated assessments of predictive accuracy may result. Further, spatially structured cross-validation may not guard against over-parameterization, and assessing a broader range of regularization parameters may be necessary to optimize model complexity for transferability.
|Title||Balancing transferability and complexity of species distribution models for rare species conservation|
|Authors||Nolan A. Helmstetter, Courtney J. Conway, Bryan S. Stevens, Amanda R. Goldberg|
|Publication Subtype||Journal Article|
|Series Title||Diversity and Distributions|
|Record Source||USGS Publications Warehouse|
|USGS Organization||Coop Res Unit Seattle|