January 9, 2017

Challenges remain in combining data from multiple organizations

A new U.S. Geological Survey study reports that almost 60 percent of previously collected nutrient water-quality records for U.S. rivers and streams have missing or ambiguous reference information. This inconsistency limits the use of these data for assessing water quality across large river basins.

The study found that nearly 14.5 million of the 25 million records collected since 1899 by nearly 500 public and private organizations at 321,927 sites across the country had missing or ambiguous metadata — the standard descriptive information needed to determine the amount of a chemical present in the sample.

“Because individual monitoring organizations understand their own data well, they are able to use the data locally to meet the original goals of data collection,” said Lori Sprague, USGS hydrologist and lead author of the study. “The problem arises when we try to combine data from multiple sources to assess water-quality conditions in large watersheds, such as the Potomac River or Mississippi River basins. Monitoring organizations often report the same metadata elements differently.”

Inconsistent information prevents water resources agencies from using massive amounts of data on a broader scale to assess the status and trends of our nation’s rivers. The adoption of standard metadata practices across all monitoring organizations in the United States could increase the amount of water data that can be used to assess water management actions in large watersheds, potentially leading to important water-quality insights that would not otherwise be possible.

The study found that metadata on whether specific water samples were filtered or unfiltered was inconclusive for nearly 12 million records. There were over 1.3 million records without data units or with inappropriate units. In addition, wide variations in chemical names and the use of abbreviations require expensive and time-consuming evaluations to determine which chemical was reported. For instance, there are 147 variations of names used for just one constituent, orthophosphate.
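As a rough illustration of the screening these problems call for, the Python sketch below flags a record whose constituent name, units, or filtration status cannot be resolved. The field names, the short alias list, and the accepted units are hypothetical and are not taken from the study; the study's 147 orthophosphate name variants are not reproduced here.

```python
# Illustrative only: these aliases, units, and field names are assumptions,
# not the lists or schema used in the USGS study.
ORTHOPHOSPHATE_ALIASES = {
    "orthophosphate",
    "ortho-phosphate",
    "ortho phosphate",
    "orthophosphate as p",
    "po4",
}
VALID_UNITS = {"mg/l", "ug/l"}
VALID_FRACTIONS = {"filtered", "unfiltered"}


def _normalize(value):
    """Lowercase and trim a metadata value; treat None as empty."""
    return (value or "").strip().lower()


def metadata_problems(record):
    """Return a list of metadata problems found in one water-quality record."""
    problems = []
    if _normalize(record.get("name")) not in ORTHOPHOSPHATE_ALIASES:
        problems.append("unrecognized constituent name")
    if _normalize(record.get("unit")) not in VALID_UNITS:
        problems.append("missing or inappropriate units")
    if _normalize(record.get("fraction")) not in VALID_FRACTIONS:
        problems.append("filtration status unclear")
    return problems


# A record with a nonstandard spelling but complete metadata passes;
# one with no units and no filtration status is flagged twice.
clean = {"name": "Ortho-Phosphate", "unit": "mg/L", "fraction": "Filtered"}
ambiguous = {"name": "PO4", "unit": "", "fraction": None}
print(metadata_problems(clean))
print(metadata_problems(ambiguous))
```

A record that returns an empty list could be combined with data from other organizations; any other result would need the kind of manual evaluation the study describes.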

The USGS study assessed data collected by 488 monitoring organizations — 19 federal agencies; 6 regional (multi-state) organizations; 100 state water, natural resources, or environmental protection agencies; 130 tribal organizations; 108 county or subcounty organizations; 24 academic organizations; 17 non-governmental organizations; 34 volunteer organizations; and 50 private organizations.

The USGS is using these multi-agency data to assess long-term trends in water quality of American streams and rivers.

The National Water Quality Monitoring Council, which brings together federal, tribal, interstate, state, local, and municipal governments, watershed groups, and national associations that include volunteer monitoring groups, is developing sets of water-quality data elements to facilitate the exchange of water-quality data among multiple agencies.

The USGS study, “Challenges with secondary use of multi-source water-quality data in the United States,” can be found online in the journal Water Research.

Additional information on water-quality monitoring and modeling is available on the USGS National Water-Quality Assessment project website.
