Optimizing selection of training and auxiliary data for operational land cover classification for the LCMAP initiative

November 23, 2016

The U.S. Geological Survey’s Land Change Monitoring, Assessment, and Projection (LCMAP) initiative is a
new end-to-end capability to continuously track and characterize changes in land cover, use, and condition
to better support research and applications relevant to resource management and environmental
change. Among the LCMAP product suite are annual land cover maps that will be available to the public.
This paper describes an approach to optimize the selection of training and auxiliary data for deriving the
thematic land cover maps based on all available clear observations from Landsats 4–8. Training data were
selected from map products of the U.S. Geological Survey’s Land Cover Trends project. The Random Forest
classifier was applied for different classification scenarios based on the Continuous Change Detection and
Classification (CCDC) algorithm. We found that extracting training data proportionally to the occurrence
of land cover classes was superior to an equal distribution of training data per class, and suggest using a
total of 20,000 training pixels to classify an area about the size of a Landsat scene. The problem of unbalanced
training data was alleviated by extracting a minimum of 600 training pixels and a maximum of
8,000 training pixels per class. We additionally explored removing outliers from the training
data using spectral and spatial criteria, but observed no significant improvement in classification
results. We also tested the importance of different types of auxiliary data that were available for the conterminous
United States, including: (a) five variables used by the National Land Cover Database, (b) three
variables from the cloud screening “Function of mask” (Fmask) statistics, and (c) two variables from the
change detection results of CCDC. We found that auxiliary variables such as a Digital Elevation Model and
its derivatives (aspect, position index, and slope), potential wetland index, water probability, snow probability,
and cloud probability improved the accuracy of land cover classification. Compared to the original
strategy of the CCDC algorithm (500 pixels per class), the use of the optimal strategy improved the classification
accuracies substantially (15-percentage point increase in overall accuracy and 4-percentage
point increase in minimum accuracy).
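
The sampling strategy described above (training pixels drawn in proportion to class occurrence, with a floor of 600 and a ceiling of 8,000 pixels per class, out of a nominal total of 20,000) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name, its parameters, the example class names, and the pixel counts are all hypothetical, and the sketch assumes a simple clamp after proportional allocation, so the final total may differ from the nominal 20,000.

```python
def allocate_training_pixels(class_counts, total=20000,
                             min_per_class=600, max_per_class=8000):
    """Allocate training pixels per class in proportion to class occurrence.

    class_counts maps a class name to the number of reference pixels
    available for that class (hypothetical input format).
    """
    n_available = sum(class_counts.values())
    if n_available == 0:
        raise ValueError("class_counts must contain at least one pixel")

    allocation = {}
    for name, count in class_counts.items():
        # Share of the nominal total proportional to class occurrence.
        proportional = round(total * count / n_available)
        # Clamp to the per-class minimum and maximum.
        clamped = max(min_per_class, min(max_per_class, proportional))
        # Assumption: a class cannot supply more pixels than it has.
        allocation[name] = min(clamped, count)
    return allocation


# Hypothetical reference pixel counts for one Landsat-scene-sized area.
example_counts = {
    "forest": 6_000_000,
    "cropland": 3_000_000,
    "water": 900_000,
    "barren": 100_000,
}
example_allocation = allocate_training_pixels(example_counts)
print(example_allocation)
```

In this made-up example the dominant class is capped at 8,000 pixels and the rarest class is raised to 600, which is how the minimum and maximum soften the imbalance that purely proportional sampling would produce.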

Publication Year: 2016
Title: Optimizing selection of training and auxiliary data for operational land cover classification for the LCMAP initiative
DOI: 10.1016/j.isprsjprs.2016.11.004
Authors: Zhe Zhu, Alisa L. Gallant, Curtis Woodcock, Bruce Pengra, Pontus Olofsson, Thomas R. Loveland, Suming Jin, Devendra Dahal, Limin Yang, Roger F. Auch
Publication Type: Article
Publication Subtype: Journal Article
Series Title: ISPRS Journal of Photogrammetry and Remote Sensing
Index ID: 70178529
Record Source: USGS Publications Warehouse
USGS Organization: Earth Resources Observation and Science (EROS) Center