Quantifying and understanding the natural streamflow regime, defined as the streamflow expected in the absence of anthropogenic modification to the hydrologic system, is critically important for developing management strategies aimed at protecting aquatic ecosystems. Water balance models have frequently been applied to estimate natural flows, but they are limited in the number of predictor variables that can be included. Here, a statistical machine learning technique, random forest modeling, was applied to estimate natural flows at a monthly time step from 1950 to 2015 for more than 2.5 million stream reaches in the conterminous United States (U.S.) using 200 potential predictor variables. We describe the development and documentation of this dataset and assess model performance. Model fit statistics (mean Nash–Sutcliffe efficiency = 0.85; observed/expected ratio = 0.94) indicate good correspondence between predicted and observed flows at nearly 2,000 streamgages. As an example application of the dataset, the observed streamflow record at a site before and after the construction of an upstream reservoir was compared with estimated natural flows to demonstrate the magnitude of seasonal streamflow depletions due to the reservoir. This dataset can be applied to quantify natural and anthropogenic processes contributing to streamflow depletion or augmentation, and to assess the associated ecological effects.
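The two fit statistics reported above can be illustrated with a minimal Python sketch. The function names and the sample flows are hypothetical, not taken from the dataset. The Nash–Sutcliffe efficiency follows its standard definition, but the abstract does not state how the observed/expected ratio is aggregated, so the sketch assumes it is total observed flow divided by total predicted flow.

```python
def nash_sutcliffe(observed, predicted):
    """Standard Nash-Sutcliffe efficiency: 1 - SSE / variance of observations.

    1.0 is a perfect fit; 0.0 means the model is no better than
    predicting the mean of the observed flows.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed flows are constant; NSE is undefined")
    return 1.0 - sse / spread


def observed_expected_ratio(observed, predicted):
    """Observed/expected ratio.

    ASSUMPTION: aggregated as total observed flow over total predicted
    flow; the source does not specify the aggregation.
    """
    total_pred = sum(predicted)
    if total_pred == 0:
        raise ValueError("predicted flows sum to zero; ratio is undefined")
    return sum(observed) / total_pred


# Hypothetical monthly flows at one streamgage (arbitrary units).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]

nse = nash_sutcliffe(obs, pred)
oe = observed_expected_ratio(obs, pred)
print(f"NSE = {nse:.3f}, O/E = {oe:.2f}")
```

Under this assumed aggregation, a ratio below 1 means that predicted flows exceed observed flows on average, which is the direction of the reported value of 0.94.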
- Digital Object Identifier: 10.1111/1752-1688.12685
- Source: USGS Publications Warehouse (indexId: 70199409)