USGS Burned Area Products Group in Denver Touts Value of Landsat ARD

By Earth Resources Observation and Science (EROS) Center, November 9, 2018

For all the great Federal records and remotely sensed products out there that have documented fires across the United States through the decades, it seems almost none have consistently and comprehensively mapped those burned areas across time and space.

At least not until now.

Time series of the Thomas Fire, created using Landsat ARD — An example of Landsat ARD being used in a Burned Area Product, focused on the 2017 Thomas Fire in California. (Public domain.)

Scientists at the USGS Geosciences and Environmental Change Science Center in Denver, CO, have developed and validated what they call the Burned Area (BA) algorithm. This computer wizardry identifies burned areas in images across the history of Landsat’s rich and dense archive. From that information come Burned Area Products that provide new and unique information about spatial and temporal patterns of fire occurrence that many existing fire databases may have missed.

But it gets even better—thanks to an initiative at the Earth Resources Observation and Science (EROS) Center that resulted in Landsat Analysis Ready Data (ARD).

When EROS reconditioned the most recent 36 years of the Landsat archive into ARD, Todd Hawbaker and his colleagues at the center in Denver gained access to the best imagery available throughout the archive, processed to a common tiling scheme. All the data come with top-of-atmosphere and surface reflectance already calculated.

That means the long days of preprocessing data to prepare them for mapping burns are largely over. It also offers the prospect of pushing out Burned Area Products virtually as quickly as the ARD becomes available—almost assuredly a boon to trying to understand what actually burned on the landscape and, just as importantly, assisting mitigation efforts seeking to limit sedimentation and debris flows after the fires die out.

Landsat ARD is a "huge step forward for us"

Hawbaker, a Research Ecologist at the Denver center, says that even with a two-week lag for Landsat 8 to acquire imagery, for Level-1 data to be processed to Tier 1, and for ARD to be generated and made ready for the BA algorithm, “it’s still better than in the past, where we’ve been two years behind.”

“There’s no question,” he added, “that this is a huge step forward for us.”

The fact that uncertainty exists in datasets about what has actually burned across America’s landscapes—and whether or not information in those datasets accurately represents patterns of burning on those landscapes—presents a number of challenges, Hawbaker said.

“Challenges if you’re trying to relate patterns of burning to, say, policy or weather patterns or vegetation patterns,” he said. “Those inconsistencies over time will drive your model results as much as the actual real drivers to those patterns of fire. So, the Burned Area Products were developed in response to that.”

There certainly is nothing intentional about the oversights of other datasets, Hawbaker added. A lot of databases are based on information gathered by incident response teams, often firefighters. In large fires, multiple agencies often respond, with each agency submitting a point in a fire database. Without careful filtering, the same fire can show up four times, thus inflating burned area estimates.

Hawbaker also indicated that it wasn’t unusual early on for people to use different coordinate systems, or to manually write coordinates down. “Just basic human error wouldn’t be surprising because they were probably more worried about protecting homes and lives and managing the fires than getting the coordinates exactly right,” he said.

Hawbaker’s group uses a machine learning approach called a Gradient Boosting Classifier for its BA algorithm. The classifier combines thousands of simple, individual classification trees, trained with data from both areas that burned in fires and areas that did not, to boost predictive power and identify pixels in any given Landsat image that may have burned.
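For readers who want a feel for how such a classifier behaves, here is a minimal sketch using scikit-learn's `GradientBoostingClassifier`. It is not the group's actual configuration: the two predictor columns are synthetic stand-ins for per-pixel values, the class means are invented, and the tree count is kept far below the "thousands" described above so the example runs quickly.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
n_per_class = 400

# Synthetic stand-ins for two per-pixel predictor variables (NOT real Landsat data).
# The class means below are invented purely for illustration.
unburned = rng.normal(loc=[0.40, 0.12], scale=0.04, size=(n_per_class, 2))
burned = rng.normal(loc=[0.15, 0.28], scale=0.04, size=(n_per_class, 2))

X = np.vstack([unburned, burned])
y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)]).astype(int)

# Shuffle, then hold out a quarter of the pixels to check the fitted model.
order = rng.permutation(len(y))
split = int(0.75 * len(y))
train, test = order[:split], order[split:]

# Many shallow trees are fitted in sequence; each new tree corrects the
# errors left by the trees before it. Hyperparameters here are assumptions.
model = GradientBoostingClassifier(
    n_estimators=200, max_depth=2, learning_rate=0.1, random_state=0
)
model.fit(X[train], y[train])

accuracy = model.score(X[test], y[test])
burn_probability = model.predict_proba(X[test])[:, 1]  # P(burned) per held-out pixel
print(f"held-out accuracy: {accuracy:.3f}")
```

The per-pixel probabilities, rather than hard labels, are what make it possible to threshold conservatively or aggressively depending on how costly a missed burn is.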

They also rely on a suite of predictor variables derived from Landsat ARD, such as the Normalized Burn Ratio (NBR), to identify burns in any given image. Pre-fire, healthy vegetation has very high near-infrared reflectance and low reflectance in the shortwave infrared portion of the spectrum. Recently burned areas, on the other hand, have relatively low reflectance in the near-infrared and high reflectance in the shortwave infrared. A high NBR value generally indicates healthy vegetation, while a low value indicates bare ground and recently burned areas.
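The NBR is conventionally defined as (NIR − SWIR) / (NIR + SWIR), so it ranges from −1 to 1. The sketch below computes it per pixel with NumPy. The reflectance values are made up for illustration, and the function name is hypothetical rather than part of any USGS tool.

```python
import numpy as np

def normalized_burn_ratio(nir, swir):
    """Per-pixel NBR = (NIR - SWIR) / (NIR + SWIR); NaN where the sum is zero."""
    nir = np.asarray(nir, dtype=np.float64)
    swir = np.asarray(swir, dtype=np.float64)
    denom = nir + swir
    nbr = np.full(denom.shape, np.nan)
    # Only divide where the denominator is nonzero; other pixels stay NaN.
    np.divide(nir - swir, denom, out=nbr, where=denom != 0)
    return nbr

# Illustrative reflectances (invented): healthy vegetation, a recent burn,
# and an empty pixel with no signal in either band.
nir = np.array([0.45, 0.12, 0.0])
swir = np.array([0.10, 0.30, 0.0])

nbr = normalized_burn_ratio(nir, swir)
print(nbr)  # first value positive (vegetation), second negative (burn), third NaN
```

Differencing NBR between a pre-fire and a post-fire image is a common follow-on step, although the article does not say whether the BA algorithm uses it that way.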

ARD-based Burned Area Products coming soon

With the availability of ARD still in its relative infancy, Hawbaker anticipates the release of his group’s ARD-based Burned Area Products will occur sometime before the end of 2018. But he doesn’t have to wait until then to know the promise and the power of ARD for time series work.

For one thing, he likes the fact that overlapping scenes come with the downloaded Landsat ARD data. “Most people did not utilize the overlapping area among path/row scenes,” he said. “Because the ARD includes the overlapping scenes, they give us a much richer representation of the time series.”

He also likes the ARD tiling system that provides the data in tiles of 5,000 by 5,000 30-meter pixels. “Because they’re chunked out in these tiles that have a consistent number of rows and columns, it gets rid of the preprocessing that we used to do with the path/row data to make sure that all the images were the same size,” Hawbaker said.

When his agency developed its algorithm, it recognized that few efforts had tried to map fires in grassland ecosystems across the country. It also knew that there was a lot of fire in the Southeastern U.S., but that vegetation grows back quickly there, and with nothing more than annual looks at Landsat scenes, “you’re probably going to miss a lot of those fires,” Hawbaker said.

“That’s where using the ARDs and the full time series really helps us as we try to look at each image,” he continued. “We analyze images with up to 80 percent cloud cover, and oftentimes the only look we’re getting at a burned area is through the gaps in the clouds. So, our data are probably still incomplete, but we found a lot more burned areas in places like the Great Plains and Southeast with ARD.”

Now that they’re working more with it, Hawbaker said his group has only a couple of suggested tweaks that could help in its work with ARD. For one, when the BA algorithm opens a full time series of an area (as many as 2,400 scenes collected between 1984 and 2017), it gets eight individual images for each scene, one for each band of the spectrum, plus a Quality Assessment (QA) band.

Hawbaker said it would be nice if instead of delivering each band as an individual image file, they could all be combined internally into one multi-band image file. “When you have several dozen processes opening up all the images at once, we would run into operating system limits on the number of files we could have open,” he said. “So, when we preprocessed the ARD data, we would combine the individual bands. That’s the one thing we’d like, is to have them delivered as a multi-band image file.”
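The band-combining step Hawbaker describes can be sketched as follows. This is only an illustration of the idea: NumPy `.npy` files stand in for the delivered image files, the band names and the small 50-by-50 tile are hypothetical, and the pixel values are random.

```python
import os
import tempfile
import numpy as np

rows, cols = 50, 50  # small stand-in for a 5,000-by-5,000 ARD tile
band_names = [f"band_{i}" for i in range(1, 9)]  # hypothetical names for eight bands
rng = np.random.default_rng(0)

with tempfile.TemporaryDirectory() as tmp:
    # Simulate delivery: one file per band for a single scene.
    for name in band_names:
        band = rng.integers(0, 10000, size=(rows, cols), dtype=np.int16)
        np.save(os.path.join(tmp, f"{name}.npy"), band)

    # Preprocess once: read each single-band file and stack along a new band axis.
    stack = np.stack(
        [np.load(os.path.join(tmp, f"{name}.npy")) for name in band_names],
        axis=0,
    )

    # Store the result as one multi-band file, so later readers open one file, not eight.
    stacked_path = os.path.join(tmp, "scene_stack.npy")
    np.save(stacked_path, stack)
    reloaded = np.load(stacked_path)

print(reloaded.shape)  # (bands, rows, cols)
```

With thousands of scenes and dozens of worker processes, cutting the per-scene file count by roughly a factor of eight is what keeps the job under the operating system's open-file limit.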

While his group also likes the addition of surface reflectance information to ARD, and the delivery of the QA band that adds details about what is cloud, cloud shadows, water, snow, and ice in the scenes, Hawbaker wonders if a second QA band might help decipher all that information better.

Landsat ARD feedback is welcomed

The QA band now relies on binary indicators that provide a true-or-false assessment as to whether artifacts in a scene are, say, clouds, and do so with different levels of confidence. Having a lot of different bit flags in that QA band “allows you to pack a lot of different information into, say, an eight-byte image … and takes out a lot of the guesswork,” Hawbaker said. “But it makes it a little bit more difficult to visualize as well.”
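To make the bit-flag idea concrete, the sketch below unpacks individual flags from a packed QA value with a shift and a mask. The bit positions in `FLAG_BITS` are hypothetical and chosen only for illustration; the real assignments should be taken from the ARD product documentation.

```python
import numpy as np

# HYPOTHETICAL bit layout for illustration only; not the official ARD assignments.
FLAG_BITS = {"cloud": 0, "cloud_shadow": 1, "water": 2, "snow_ice": 3}

def flag_mask(qa, flag):
    """Return a boolean array that is True where the named flag bit is set."""
    bit = FLAG_BITS[flag]
    return ((qa >> bit) & 1) == 1

# A tiny 2x2 QA "image": each pixel packs several true/false answers into one integer.
qa = np.array([[0b0001, 0b0010],
               [0b0101, 0b0000]], dtype=np.uint16)

cloud = flag_mask(qa, "cloud")
shadow = flag_mask(qa, "cloud_shadow")
water = flag_mask(qa, "water")

# One possible "usable pixel" rule: neither cloud nor cloud shadow.
usable = ~(cloud | shadow)
print(usable)
```

A packed value such as `0b0101` is hard to interpret by eye, which is the visualization difficulty Hawbaker mentions; each derived boolean mask, by contrast, can be displayed directly.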

While he and his colleagues embrace the value of that QA band, for their purposes they would like to see something like a second band that would say “this is our suggestion for what we think are the best QA combinations,” Hawbaker said.

John Dwyer, the Science and Applications Branch Chief at EROS, said the USGS considered all conceivable aspects of ARD specification as it was first conceptualized and developed, including how the product bands are packaged and how pixel QA information is presented. So, recognizing that certain higher-level algorithms are likely to transform ARD for their specific use cases, USGS finalized delivery of individual band files and bit-packed QA as the most versatile, least common denominator option, he said.

Given that, Dwyer indicated that user feedback offered by Hawbaker and others is welcomed and valued as input to future versions of ARD. Since generating datasets to fit the variety of applications in the land science community is challenging at best, receiving user input from the likes of the Burned Area folks—or any other ARD-based results from scientists applying time series analyses—will be helpful in identifying which enhancements will serve the greatest good, Dwyer said.

For now, Hawbaker sees all of that as a minor sidebar in the larger conversation about how valuable ARD will be going forward. The reality in his office in Denver is that this bold new ARD innovation is going to make it easier for EROS to handle the production of their Burned Area Products.

“Maybe if you talk to enough people, you can figure out what really resonates across different groups, and what’s worth producing,” he said. “I just know that as the ARDs are delivered now, they’re really good. We’re going to learn so much more from time series analysis of Landsat data, and ARDs are critical to that.”

For more information about Landsat ARD or Burned Area Products:

https://www.mdpi.com/2072-4292/10/9/1363

https://landsat.usgs.gov/landsat-burned-area