Aqua INHABIT species potential distribution across the contiguous United States
This dataset contains the potential distribution of 103 invasive species found in or adjacent to freshwater environments. We developed habitat suitability models for invasive freshwater species selected by resource management agencies and other managers. We adapted the modeling workflow described in Jarnevich et al. (2024, https://doi.org/10.3897/neobiota.96.134842). We developed a national library of environmental variables known to physiologically limit freshwater species distributions (Henderson et al. 2025, https://doi.org/10.5066/P14JDTTJ) and relied on human input based on natural history knowledge to narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms (boosted regression tree, generalized linear model, multivariate adaptive regression spline, maxent, and random forest) with VisTrails: Software for Assisted Habitat Modeling (SAHM 2.2.2, Morisette et al., 2013: https://doi.org/10.1111/j.1600-0587.2012.07815.x). For each species, we generated models for waterbody and stream environments where enough data were available in each system. For several predictors, up to five alternative future scenarios were available. We combined stream and waterbody models into a single map representing freshwater environments using a weighted ensemble across algorithms, for both current and alternative future conditions. We also calculated subwatershed (Hydrologic Unit Code [HUC] 12) summaries of information related to pathways of spread, and integrated this information with current and alternative future habitat suitability to produce a risk score for each HUC12 east of the Mississippi River.
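As a rough illustration of a weighted ensemble across algorithms, the sketch below computes a weighted mean of suitability values for a single pixel. It is a minimal sketch under stated assumptions, not the project's implementation: the function name, the weight values, and the choice to skip missing predictions are all hypothetical, and this description does not state how the weights were derived.

```python
import math


def weighted_ensemble(pixel_values, weights):
    """Weighted mean of one pixel's suitability across algorithms.

    pixel_values: suitability predicted by each algorithm (NaN = no prediction).
    weights: one non-negative weight per algorithm (hypothetical values here).
    """
    if len(pixel_values) != len(weights):
        raise ValueError("need exactly one weight per algorithm")
    total = 0.0
    weight_total = 0.0
    for value, weight in zip(pixel_values, weights):
        if math.isnan(value):
            continue  # assumption: algorithms without a prediction are skipped
        total += value * weight
        weight_total += weight
    return total / weight_total if weight_total > 0 else math.nan


# Order follows the five algorithms named above: boosted regression tree,
# generalized linear model, multivariate adaptive regression spline, maxent,
# random forest. The weights are made up for illustration only.
weights = [0.8, 0.7, 0.75, 0.85, 0.9]
pixel = [0.6, 0.4, 0.5, 0.7, 0.8]
ensemble_value = weighted_ensemble(pixel, weights)
```

In practice the same calculation would be applied to every pixel of the stream and waterbody rasters, typically with an array library instead of a per-pixel loop.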
This data bundle contains a single file of tabular summaries by management unit (including each species/ensemble type/abundance level combination), the merged data sets used to create models, tabular outputs including response curve data, variable importance information, model assessment metrics, a single file of pathways and risk scores, a spatial vector layer of the subwatersheds to which pathways and risk are summarized, a species metadata file, and R scripts to produce the model inputs and final products.
The bundle documentation files are:
1) 'AquaINHABIT_V1_metadata.xml' (this file) contains the project-level metadata.
2) 'Species_model_information.csv' contains information on the species-specific changes made when tuning algorithm parameters to ensure model quality.
3) 'Merged_dataset.csv' contains the merged data set used to create the models, including location and associated environmental data, for each species.
4) 'Variable_importance.csv' contains tabular summaries of predictor importance for each of the models produced for each species.
5) 'Assessment_metrics.csv' contains tabular summaries of assessment metrics for each model or ensemble for each species.
6) 'Risk_and_pathways_scores.csv' contains summarized values for pathways, current suitability, future suitability, and a combined risk for each species by HUC12.
7) 'WBD_HUC12_EasternUS.gpkg' is the spatial vector file of the HUC12 boundaries used for analyses and summarizations.
8) 'Rcode.zip' is a zipped file of the R scripts used to pull and prepare the data included in the merged dataset, to make the derived raster outputs, and to calculate the pathway and suitability summaries along with the final risk score.
There are also three child items.
1) The "response curves" child items (grouped by taxa) contain files named XX_response_curves.csv, where XX is the species code from 'Species_model_information.csv'. Each file holds the tabular information needed to produce response curves for each predictor retained in each of the up to 10 models produced for that species.
2) The "management summaries" child item contains files named Management_summaries_XX.csv, which are the tabular summaries by management area, where XX indicates the management area group. The management area groups are HUC12 and US Army Corps of Engineers (USACE) project areas, with one file for USACE and the HUC12 summaries broken out by HUC2 watershed (01 to 18).
3) The "rasters" child items (grouped by taxa) contain raster files named XX_YY.tif, where XX is the species code from 'Species_model_information.csv' and YY is the raster type. There are seven rasters for each species:
1) Current occurrence suitability - Continuous value ensemble (XX-ens-current-mean.tif)
2) Restricted current occurrence suitability - Continuous value ensemble with restricted environmental conditions* (XX-ens-current-mean-masked.tif)
3) Future occurrence suitability - Continuous value ensemble (XX-ens-future-mean.tif)
4) Maximum future occurrence suitability - Maximum continuous value ensemble from the five alternative scenarios (XX-ens-future-max-gcm.tif)
5) Minimum future occurrence suitability - Minimum continuous value ensemble from the five alternative scenarios (XX-ens-future-min-gcm.tif)
6) Standard deviation of future occurrence suitability - Standard deviation of continuous value from each algorithm for the five alternative scenarios (XX-ens-future-stdev.tif)
7) Restricted count - Count of ensembles from each alternative scenario with restricted environmental conditions* (XX-gcm-mask-count.tif)
*Restricted environmental conditions = areas are displayed only where environmental characteristics are inside the range of values used to develop the model. For example, a location with a minimum winter temperature of 12 °C would be outside a range of -10 to 10 °C used in model development and would not be displayed.
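The restriction described in the footnote can be sketched as a simple range check on each predictor. This is a minimal sketch under stated assumptions: the function names, the predictor name `min_winter_temp_c`, and the use of NaN to represent a location that is not displayed are hypothetical choices for illustration, and only the -10 to 10 °C range and the 12 °C value come from the example above.

```python
import math


def inside_training_range(value, training_range):
    """True when a predictor value lies within the range used to fit the model."""
    low, high = training_range
    return low <= value <= high


def restrict_suitability(suitability, predictors, training_ranges):
    """Return the suitability value, or NaN when any predictor is out of range.

    predictors: predictor name -> value at this location.
    training_ranges: predictor name -> (min, max) seen during model development.
    """
    for name, value in predictors.items():
        if not inside_training_range(value, training_ranges[name]):
            return math.nan  # assumption: NaN stands for "not displayed"
    return suitability


# Range taken from the footnote example; the predictor name is hypothetical.
training_ranges = {"min_winter_temp_c": (-10.0, 10.0)}

kept = restrict_suitability(0.8, {"min_winter_temp_c": -5.0}, training_ranges)
masked = restrict_suitability(0.9, {"min_winter_temp_c": 12.0}, training_ranges)
```

Here the location at -5 °C keeps its suitability value, while the location at 12 °C falls outside the training range and is masked.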
These data will be integrated into the first version of AquaINHABIT, a web application displaying visual and statistical summaries of freshwater habitat suitability models for manager-identified invasive species. These species include: Aldrovanda vesiculosa, Alisma plantago-aquatica, Alosa pseudoharengus, Alternanthera philoxeroides, Ambloplites rupestris, Ameiurus catus, Ameiurus melas, Ameiurus natalis, Arundo donax, Astronotus ocellatus, Azolla cristata, Azolla pinnata, Bithynia tentaculata, Butomus umbellatus, Bythotrephes longimanus, Cabomba caroliniana, Callitriche stagnalis, Canna glauca, Carassius auratus, Carassius gibelio, Channa argus, Chrosomus oreas, Cipangopaludina chinensis, Cipangopaludina japonica, Clarias batrachus, Colocasia esculenta, Corbicula fluminea, Crassula helmsii, Ctenopharyngodon idella, Cyperus blepharoleptos, Cyperus papyrus, Cyprinella lutrensis, Cyprinus carpio, Daphnia lumholtzi, Didymosphenia geminata, Dorosoma cepedianum, Dorosoma petenense, Dreissena bugensis, Dreissena polymorpha, Echinogammarus ischnus, Egeria densa, Egeria najas, Eichhornia azurea, Eichhornia crassipes, Eichhornia paniculata, Eubosmina coregoni, Fallopia japonica, Faxonius rusticus, Faxonius virilis, Gambusia affinis, Gambusia holbrooki, Glyceria maxima, Gymnocephalus cernua, Hemimysis anomala, Hottonia palustris, Hydrilla verticillata, Hydrilla verticillata peregrina, Hydrilla verticillata verticillata, Hydrocharis morsus-ranae, Hydrocotyle ranunculoides, Hygrophila polysperma, Hypophthalmichthys molitrix, Hypophthalmichthys nobilis, Ictalurus furcatus, Ipomoea aquatica, Iris pseudacorus, Lagarosiphon major, Landoltia punctata, Lepomis gulosus, Limnobium laevigatum, Limnobium spongia, Limnocharis flava, Limnophila indica, Limnophila sessiliflora, Ludwigia decurrens, Ludwigia grandiflora, Ludwigia hexapetala, Ludwigia octovalvis, Ludwigia peploides, Ludwigia peruviana, Lythrum hyssopifolia, Lythrum portula, Lythrum salicaria, Marsilea quadrifolia, Melaleuca quinquenervia, Melanoides tuberculata, Mentha aquatica, Micropterus henshalli, Misgurnus anguillicaudatus, Monochoria vaginalis, Monopterus albus, Murdannia keisak, Mylopharyngodon piceus, Myosotis scorpioides, Myriophyllum aquaticum, Myriophyllum heterophyllum, Myriophyllum sibiricum, Myriophyllum spicatum, Najas marina, Najas minor, Nasturtium microphyllum, Nasturtium officinale, Nelumbo lutea, Nelumbo nucifera, Neogobius melanostomus, Nitellopsis obtusa, Nymphaea alba, Nymphaea lotus, Nymphaea mexicana, Nymphoides cristata, Nymphoides grayana, Nymphoides indica, Nymphoides peltata, Oreochromis aureus, Oreochromis niloticus, Oryza sativa, Panicum repens, Persicaria hydropiper, Phalaris arundinacea, Phragmites australis australis, Pistia stratiotes, Pomacea canaliculata, Pomacea maculata, Potamogeton crispus, Potamopyrgus antipodarum, Procambarus clarkii, Proterorhinus semilunaris, Pterygoplichthys anisitsi, Pterygoplichthys disjunctivus, Pterygoplichthys multiradiatus, Pterygoplichthys pardalis, Pylodictis olivaris, Rotala indica, Rotala rotundifolia, Sagittaria guayanensis, Sagittaria montevidensis, Salmo trutta, Salvinia auriculata, Salvinia minima, Salvinia molesta, Salvinia natans, Salvinia oblongifolia, Sander lucioperca, Scardinius erythrophthalmus, Sporobolus anglicus, Stratiotes aloides, Tamarix chinensis and ramosissima, Tamarix sp, Tinca tinca, Trapa bispinosa, Trapa natans, Typha domingensis, Utricularia inflata, and Veronica anagallis-aquatica.
Citation Information
| Publication Year | 2026 |
|---|---|
| Title | Aqua INHABIT species potential distribution across the contiguous United States |
| DOI | 10.5066/P13JMOQW |
| Authors | Catherine S Jarnevich, Peder S Engelstad, Demetra (Contractor) A Williams, Keana (Contractor) S Shadwell, Cameron (Contractor) J Reimer, Grace (Contractor) C Henderson, Linnea (Contractor) S Fraser, Shelby (Contractor) K Leclare, Rich D Inman, Ian A Pfingsten, Wesley M Daniel |
| Product Type | Data Release |
| Record Source | USGS Asset Identifier Service (AIS) |
| USGS Organization | Fort Collins Science Center |
| Rights | This work is marked with CC0 1.0 Universal |