Satellite-Derived Training Data for Automated Flood Detection in the Continental U.S.

December 15, 2020

Remotely sensed imagery is increasingly used by emergency managers to monitor and map the impact of flood events, supporting preparedness, response, and critical decision making throughout the flood-event lifecycle. Automated flood-mapping workflows are needed to reduce latency in the delivery of imagery-derived information, ensure consistent and reliably derived map products, and keep pace with an increasing volume of remote sensing data streams. The U.S. Geological Survey, in collaboration with NASA, the National Geospatial-Intelligence Agency (NGA), the University of Alabama, and the University of Illinois, is facilitating the development and integration of machine-learning algorithms to create a workflow for rapidly generating improved flood-map products.

A major bottleneck in training robust, generalizable machine-learning algorithms for pattern recognition is the lack of training data that are representative across the landscape. To overcome this limitation for algorithms capable of detecting surface inundation in diverse contexts, this publication includes training data developed from Maxar WorldView sensors. This data release consists of 100 thematic rasters, in GeoTIFF format, with image labels representing five discrete categories: water, not water, maybe water, clouds, and background/no data. Specifically, these training data were created by labeling 8-band, multispectral scenes from the Maxar/DigitalGlobe WorldView-2 and WorldView-3 satellite-based sensors. Scenes were selected to be spatially and spectrally diverse and geographically representative of different water features within the continental U.S. The labeling procedure used a hybrid approach: unsupervised classification for initial spectral clustering, followed by expert-level manual interpretation and QA/QC peer review to finalize each labeled image.
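As a minimal sketch of how a user might inspect one of these thematic rasters, the snippet below tallies pixel counts per class from a label array. The integer class codes and the `class_counts` helper here are assumptions for illustration only; the authoritative codes are defined in the data release's metadata (doi:10.5066/P9C7HYRV), and in practice the array would be read from a GeoTIFF (e.g., with rasterio) rather than constructed by hand.

```python
import numpy as np

# Hypothetical integer codes for the five thematic classes; the actual
# codes are defined in the data release's metadata, not assumed here.
CLASSES = {
    0: "background/no data",
    1: "water",
    2: "not water",
    3: "maybe water",
    4: "clouds",
}

def class_counts(label_raster: np.ndarray) -> dict:
    """Tally the number of pixels assigned to each thematic class."""
    values, counts = np.unique(label_raster, return_counts=True)
    return {CLASSES.get(int(v), f"unknown ({v})"): int(c)
            for v, c in zip(values, counts)}

# A toy 4x4 label array standing in for one of the 100 GeoTIFF rasters.
labels = np.array([[1, 1, 2, 0],
                   [1, 3, 2, 0],
                   [4, 4, 2, 2],
                   [1, 1, 3, 2]])
print(class_counts(labels))
```

A per-class pixel tally like this is a quick sanity check on class balance before feeding the rasters into a machine-learning training pipeline.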
The 100 raster files that make up the training data are available for download at doi:10.5066/P9C7HYRV.