Integration of National Soil and Wetland Datasets: A Toolkit for Reproducible Calculation and Quality Assessment of Imputed Wetland Soil Properties



Wetland soils are vital to the Nation because of their role in sustaining water resources, supporting critical ecosystems, and sequestering significant quantities of biologically produced carbon.  The United States has the world’s most detailed continent-scale digital datasets for soils and wetlands, yet scientists and land managers have long struggled to integrate these datasets for research and for resource assessment and management.  The difficulties include spatial and temporal uncertainties, inconsistencies among data sources, and inherent structural complexities of the datasets.  This project’s objective was to develop and document a set of methods to impute wetland soil properties by integrating Soil Survey Geographic (SSURGO) data with the National Wetlands Inventory (NWI) and other data sources relevant to the extent and properties of wetlands.



The project methods build on the project team’s current research and development of best practices for analysis and application of soil and wetland data.  Documentation of the process is meant to ensure that the imputed wetland soil properties are fully transparent and reproducible, so that they can serve a broad range of applications beyond the immediate interests of this project team.

Accomplishments

The project team combined the most recently available SSURGO dataset (mostly 1:24,000 map scale) with gap-filling from the generalized (1:250,000 map scale) Digital General Soil Map of the United States (STATSGO2) dataset.  The combined SSURGO/STATSGO2 dataset is assembled as a map layer with associated tables of relational attributes for each soil map unit.
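
The gap-filling step described above can be illustrated with a minimal sketch. It is not the project team’s workflow: the cell identifiers, map unit keys, and the function name gap_fill are hypothetical, and real processing would operate on geospatial layers rather than dictionaries. The sketch only shows the rule that the detailed source is preferred wherever it exists and the generalized source fills the remainder, with the source recorded for each cell.

```python
from typing import Optional


def gap_fill(
    detailed: dict[str, Optional[str]],
    generalized: dict[str, str],
) -> dict[str, tuple[str, str]]:
    """Return one (map unit key, source) pair per cell.

    The detailed value wins wherever it exists; the generalized value
    fills the remaining gaps. All keys here are hypothetical.
    """
    combined: dict[str, tuple[str, str]] = {}
    for cell_id, general_key in generalized.items():
        detailed_key = detailed.get(cell_id)
        if detailed_key is not None:
            combined[cell_id] = (detailed_key, "SSURGO")
        else:
            combined[cell_id] = (general_key, "STATSGO2")
    return combined


# Hypothetical cells: "c2" has no detailed coverage.
detailed_cells = {"c1": "d-101", "c2": None, "c3": "d-103"}
generalized_cells = {"c1": "g-9", "c2": "g-9", "c3": "g-7"}
combined_cells = gap_fill(detailed_cells, generalized_cells)
```

Recording the source alongside each key keeps the difference in map scale visible to anyone who later uses the combined layer.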



Using the combined SSURGO/STATSGO2 dataset, the team identified and extracted wetland-related attributes of soil map units, components, and component horizons for all soil map units within the conterminous United States.  To facilitate imputation of wetland soil properties, the team organized these attributes into 10 spatially distinct map layers, each showing a category of hydric soil component occurrence or of soil flooding or ponding.  The team also extracted 21 spatially distinct map layers from the NWI Wetlands Layer based on classification attributes related to soils and vegetation.  Each NWI-derived map layer retains the full NWI alphanumeric classification code for each wetland polygon.
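
As a rough illustration of how component-level attributes might be summarized into categories per soil map unit, the sketch below sums the percentage of each map unit occupied by hydric components. The field names mukey, comppct_r, and hydricrating are modeled on the SSURGO component table and should be treated as assumptions here; the category names and break points are hypothetical and are not the team’s 10 layers.

```python
from collections import defaultdict


def hydric_percent_by_map_unit(components):
    """Sum the percentage of each map unit occupied by hydric components."""
    totals = defaultdict(float)
    for comp in components:
        pct = comp["comppct_r"] if comp["hydricrating"] == "Yes" else 0.0
        totals[comp["mukey"]] += pct
    return dict(totals)


def hydric_category(percent):
    """Assign an illustrative category; the break points are assumptions."""
    if percent <= 0:
        return "not hydric"
    if percent < 50:
        return "minor hydric"
    if percent < 100:
        return "mostly hydric"
    return "all hydric"


# Hypothetical component records for two map units.
sample_components = [
    {"mukey": "100", "comppct_r": 60, "hydricrating": "Yes"},
    {"mukey": "100", "comppct_r": 40, "hydricrating": "No"},
    {"mukey": "200", "comppct_r": 100, "hydricrating": "No"},
]
hydric_pct = hydric_percent_by_map_unit(sample_components)
hydric_cat = {mukey: hydric_category(p) for mukey, p in hydric_pct.items()}
```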



Preliminary spatial integration of the extracted SSURGO/STATSGO2 and NWI datasets has shown that individual soil map units with wetland-associated attributes often do not align spatially with established wetland polygons.  The project team will continue to explore the spatial integration of these datasets, with emphasis on defining additional spatial relationships that will support imputation of wetland soil properties.



The project team created a geospatial data layer that supports imputation of SSURGO/STATSGO2 soil attributes for NWI wetlands based on a combination of spatial proximity and similarity of ecosystem and hydrographic settings.  This “imputation layer” combines map layers of 4-digit hydrologic unit codes with LANDFIRE/Nature Conservancy existing vegetation type (EVT) and biophysical settings layers.
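
One way to picture imputation by similarity of setting is a lookup keyed on the combined attributes. The sketch below is a simplification under stated assumptions: it matches wetlands to soil records that share the same hydrologic unit, vegetation type, and biophysical setting, and assigns an area-weighted mean of a soil property. It does not model spatial proximity, and the field names (huc4, evt, bps, area_ha, oc_pct, wetland_id) and all values are hypothetical.

```python
from collections import defaultdict


def setting_key(record):
    """Composite key describing the hydrographic and ecosystem setting."""
    return (record["huc4"], record["evt"], record["bps"])


def area_weighted_means(soil_records, prop):
    """Area-weighted mean of a soil property for each setting key."""
    sums = defaultdict(float)
    areas = defaultdict(float)
    for rec in soil_records:
        if rec[prop] is None:
            continue  # skip records with no value for this property
        key = setting_key(rec)
        sums[key] += rec[prop] * rec["area_ha"]
        areas[key] += rec["area_ha"]
    return {key: sums[key] / areas[key] for key in sums if areas[key] > 0}


def impute(wetlands, soil_records, prop):
    """Give each wetland the mean for its setting, or None if no match."""
    means = area_weighted_means(soil_records, prop)
    return {w["wetland_id"]: means.get(setting_key(w)) for w in wetlands}


# Hypothetical soil records and wetlands.
soils = [
    {"huc4": "0304", "evt": "E1", "bps": "B1", "area_ha": 1.0, "oc_pct": 10.0},
    {"huc4": "0304", "evt": "E1", "bps": "B1", "area_ha": 3.0, "oc_pct": 20.0},
]
wetlands = [
    {"wetland_id": "w1", "huc4": "0304", "evt": "E1", "bps": "B1"},
    {"wetland_id": "w2", "huc4": "0305", "evt": "E1", "bps": "B1"},
]
imputed = impute(wetlands, soils, "oc_pct")
```

Returning None for wetlands without a matching setting keeps unimputed polygons explicit rather than silently assigning them a default value.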



The team is conducting rigorous quality tests on the datasets described above, including area checksums, searches for null and illogical values, comparisons to independent datasets, and assessments of reproducibility reflected in similar recent data extractions for other investigations.  These tests have revealed several unforeseen problems, which are now the primary focus of the team’s ongoing work on this project.
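
Two of the quality tests named above, area checksums and searches for null and illogical values, can be sketched in a few lines. This is an illustrative sketch, not the team’s test suite: the function names, field names, tolerance, and valid range are all assumptions.

```python
import math


def area_checksum_ok(total_area, part_areas, rel_tol=1e-6):
    """Check that the parts sum to the expected total within a tolerance."""
    return math.isclose(total_area, math.fsum(part_areas), rel_tol=rel_tol)


def find_suspect_values(rows, field, low, high):
    """List (row id, reason) pairs for null or out-of-range values."""
    suspects = []
    for row in rows:
        value = row.get(field)
        if value is None:
            suspects.append((row["id"], "null"))
        elif not (low <= value <= high):
            suspects.append((row["id"], "out of range"))
    return suspects


# Hypothetical rows: a percentage field should lie between 0 and 100.
rows = [
    {"id": "a", "hydric_pct": 35.0},
    {"id": "b", "hydric_pct": None},
    {"id": "c", "hydric_pct": 120.0},
]
suspects = find_suspect_values(rows, "hydric_pct", 0.0, 100.0)
checksum_passes = area_checksum_ok(100.0, [60.0, 40.0])
```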



Finally, the team has initiated inquiries concerning an appropriate domain repository for the project’s datasets and accompanying documentation.  The team had originally planned for long-term storage and access of the data through the Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC); however, it has recently become unclear whether the ORNL DAAC will continue archiving carbon-related datasets.  The project team is waiting to learn more before proceeding with data submission to the ORNL DAAC.







Note:  This description is from the Community for Data Integration 2016 Annual Report.