Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation

March 18, 2022

Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or 'label images') collated for training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes geospatial imagery (satellite, aerial, and UAV imagery and orthomosaics) as well as non-geospatial oblique and nadir imagery. The images cover a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, and consist of time-series of high-resolution (<=1 m) orthomosaics and satellite image tiles (10-30 m). For each dataset, the images, image annotations, and labeled images are distributed together as a single zipped file. Zipped files follow the naming convention {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (in other words, 001 is version 1). Each zipped file contains a folder of NPZ format files, each of which corresponds to an individual image, and a CSV file containing metadata for every labeled image.
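The naming convention above can be split back into its three fields. The following is a minimal sketch, not part of the data release: the function name `parse_archive_name`, the `CoastTrainArchive` tuple, and the example filename `NAIP_11_001.zip` are hypothetical and used only for illustration. It assumes the fields are separated by underscores exactly as written in the convention, and it splits from the right so that a data source name containing underscores would still parse.

```python
from pathlib import Path
from typing import NamedTuple


class CoastTrainArchive(NamedTuple):
    """Fields of a {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip name."""
    datasource: str
    n_classes: int
    version: int


def parse_archive_name(filename: str) -> CoastTrainArchive:
    """Split a zipped-file name into data source, class count, and version."""
    path = Path(filename)
    if path.suffix.lower() != ".zip":
        raise ValueError(f"expected a .zip file, got {filename!r}")
    # Split from the right: the last two fields are numeric, and the
    # data source name (assumption) might itself contain underscores.
    parts = path.stem.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"name does not match the convention: {filename!r}")
    datasource, n_classes, version = parts
    if not n_classes.isdigit():
        raise ValueError(f"class count is not a number: {n_classes!r}")
    if len(version) != 3 or not version.isdigit():
        raise ValueError(f"version is not a three-digit code: {version!r}")
    return CoastTrainArchive(datasource, int(n_classes), int(version))


# Hypothetical filename, for illustration only.
print(parse_archive_name("NAIP_11_001.zip"))
```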
An individual NPZ file is named after the image that it represents and contains the following variables: orig_image (original, unedited input image), image (original input image after color balancing and normalization), classes (list of classes annotated and present in the labeled image), doodles (integer image of all image annotations), color_doodles (color image of doodles), label (labeled image created from the classes present in the annotations), and settings (annotation and machine learning settings used to generate the labeled image from annotations). All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022; https://doi.org/10.5066/P9YVHL23), a process documented on the Coast Train website (https://dbuscombe-usgs.github.io/CoastTrain/docs/).
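The Doodler utilities are the documented route for extracting these files. As a rough alternative for quick inspection, an NPZ file can also be opened directly with NumPy. The sketch below is an illustration under stated assumptions, not the release's own tooling: the helper name `summarize_npz` is hypothetical, and `allow_pickle=True` is an assumption made in case variables such as settings are stored as object arrays.

```python
import numpy as np

# Variable names as listed in the dataset description.
EXPECTED_KEYS = (
    "orig_image", "image", "classes", "doodles",
    "color_doodles", "label", "settings",
)


def summarize_npz(path):
    """Return {variable name: (shape, dtype)} for each expected variable in the file."""
    summary = {}
    # Assumption: some variables may be object arrays, which NumPy only
    # loads with allow_pickle=True. Enable this for trusted files only.
    with np.load(path, allow_pickle=True) as data:
        for key in EXPECTED_KEYS:
            if key in data.files:
                arr = data[key]
                summary[key] = (arr.shape, str(arr.dtype))
    return summary
```

Calling `summarize_npz` on one extracted NPZ file gives a quick check of which variables are present and whether the `image` and `label` arrays have matching height and width before they are passed to a training pipeline.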

Publication Year 2022
Title Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation
DOI 10.5066/P91NP87I
Authors Phillipe A Wernette, Daniel D Buscombe, Jaycee Favela, Sharon N Fitzpatrick, Evan Goldstein, Nicholas M Enwright, Erin Dunand
Product Type Data Release
Record Source USGS Digital Object Identifier Catalog
USGS Organization Pacific Coastal and Marine Science Center