Denali, Tallgrass at EROS Launch New Era in High-Performance Computing Capabilities


One of the challenges in prototyping and evaluating research projects related to the U.S. Geological Survey’s (USGS) Land Change Monitoring, Assessment, and Projection (LCMAP) initiative has been what software engineers call “the last mile problem.”

Color image of Denali supercomputer

Denali is one of two new high-performance computing systems that arrived at the USGS EROS Center in 2019; the other is Tallgrass.

The LCMAP team at the Earth Resources Observation and Science (EROS) Center recently celebrated the release of its first suite of land cover change and landscape change products. But getting to that point required vast amounts of Landsat Analysis Ready Data (ARD) and enough computing power for the LCMAP algorithms to churn out annual land cover and spectral change map products going all the way back to 1985.

In fact, the LCMAP team had to lean heavily on the USGS Yeti supercomputer in Denver, CO, to evaluate those algorithms. That posed its own challenges, especially because the bandwidth available for transferring all that data between South Dakota and Colorado was less than optimal, said Kelcy Smith, a software engineer working on the LCMAP project as a KBR contractor to the USGS.

Thus, the “last mile” conundrum.

“The (web) connection to Yeti inhibited any large-scale prototyping and evaluation of research,” Smith said. “Most research projects are concerned with relatively small amounts of data. LCMAP’s current interests lie with dense time series across the Contiguous United States (CONUS). That’s a lot of data to transfer when you are evaluating an algorithm.”

Bandwidth Connections No Longer a Problem

But now that problem seems to have been solved with the arrival of two new high-performance computing (HPC) options, Denali and Tallgrass, which came to EROS in 2019. Projects at the Center like LCMAP no longer have to worry about bandwidth because the supercomputing horsepower is right there on the EROS campus.

Smith said he’s been using Denali since this past January. “LCMAP is currently moving most of its research and development work to Denali,” he said. “And I can foresee usage of it being incorporated in other aspects of LCMAP, such as application or assessment work.”

Denali was turned on before Christmas 2019 after its installation pushed the boundaries of EROS’ facilities infrastructure and accelerated the Center’s facilities plan for cooling and power upgrades. Once initial testing was completed, Denali became fully operational in mid-January, said Jeff Falgout, a computer scientist and the technical lead with Science Analytics and Synthesis (SAS), Advanced Research Computing (ARC), within the USGS Core Science Systems (CSS) Mission Area in Denver, CO.

Now, just six years after CSS embarked on the HPC path and made Yeti its first supercomputer, ARC/SAS is funding, operating, maintaining, and supporting Yeti and this next generation of HPC, Denali and Tallgrass, Falgout said. In that time, CSS has moved from Yeti’s 143 compute nodes to Denali’s more robust 232 nodes. Today, Denali is the flagship supercomputing system within the USGS. Tallgrass is a prototype system that allows people to explore different research and development possibilities and to experiment with artificial intelligence and machine learning.

More Compute Power

Denali and Tallgrass at EROS are giving researchers across the USGS access to greater compute power to do training runs, take care of their infrastructure needs, explore and develop artificial intelligence and machine learning workflows at large scale, and even dive into emergency response.

Denali has allowed Smith to use the LCMAP Information Warehouse and Data Store services to access all the available Landsat ARD for such things as time series algorithm approaches without having to keep a copy downloaded somewhere on a file system.

“These services allow access to all of our production results to continue future work,” he said.

LCMAP isn’t the only project at EROS interested in the new HPC capabilities. Members of Research Physical Scientist Terry Sohl’s FOREcasting SCEnarios (FORE-SCE) of land-use change project, which projects the influence of future land use on everything from water quality to local climate, say they are now using Denali as well.

The FORE-SCE team started on Yeti last fall when the old Land Carbon server it was using was discontinued. Steve Wika, a software engineer with KBR and a contractor to the USGS, said team members were well aware when they moved to Yeti that their use of it would be temporary.

HPC Use Outside of EROS

Outside of EROS, a number of agencies use Denali and Tallgrass as well, Falgout said. The USGS Cascades Volcano Observatory (CVO) in Vancouver, WA, has been employing the systems for its work on the East Rift Zone of Kilauea in the Hawaiian Islands, he said.

Color photo of Yeti high performance computer

Yeti, the first supercomputer used within the USGS, was brought in six years ago by the Science Analytics and Synthesis, Advanced Research Computing group within the USGS Core Science Systems Mission Area in Denver, CO.

The University of Hawaii, in partnership with the Hawaiian Volcano Observatory, is likewise interested in volcanoes and sees Denali and Tallgrass as useful for assisting with its daily volcanic fog, or VOG, forecasts, Falgout said. Volcanic fog is a visible haze composed of gas and an aerosol of tiny particles and acidic droplets created when sulfur dioxide and other gases emitted from a volcano chemically interact with sunlight and atmospheric oxygen, moisture, and dust.

“We’re setting up to run in tandem on a daily basis for them so that in case something happens out at the university, they have a second run they can use to do forecasting of VOG,” Falgout said.

There have been conversations about possibly applying machine-learning techniques with HPC to estimate water or lake temperature in places like the Great Lakes, he said, and discussions about using the new HPC machines for seismicity studies on Kilauea.

There’s certainly room for that kind of activity on Denali and Tallgrass, Falgout said. On a recent spring day, Denali was operating at about 38 percent of its capacity. People wanting to work on Denali had to wait perhaps three seconds in line; for Tallgrass, perhaps a second, he said. For Yeti, which often runs at full capacity, the wait could be a day depending on a user’s needs.

Switch from Yeti to Denali, Tallgrass Goes Well

“There’s always ramp-up time as people start to get their stuff ready to go, do a little bit of porting of their code, getting familiar with the platform,” Falgout said. But the switch from Yeti to Denali “has actually gone a little bit quicker than we anticipated,” he added. “I figure by this summer, we’ll probably be running Denali full bore.”

Smith said he believes the plan for LCMAP is to use Denali long term, especially with its faster processing capabilities and thus the ability to evaluate current prototype algorithms in a more timely fashion. He does, however, worry about the ability to access Denali when needed.

“This is possibly my chief concern with Denali,” Smith said. “Yeti is currently a very busy system, and I expect it is only a matter of time until either people migrate to Denali from Yeti, or enough new users start using it that this will become a problem. We’ll cross this bridge if or when we come to it.”

Wika said waiting in line for HPC time isn’t as big a deal with the FORE-SCE model. When they are looking at study areas, they usually have the leeway to spend several months on that work, he said.

“I haven’t actually seen any waiting times yet for what we’re doing,” Wika said. “Maybe other people have, but most of the work that I’ve done (on HPCs) out there have usually started within 10 to 15 minutes.”

As of now, Yeti is scheduled to be taken offline early in 2022, and a new supercomputer will then take its place. Denali is a leased machine that will leave EROS in 2024 and be returned to its owner. What will come on board after Denali and Tallgrass, and where those systems will be located, is still to be determined. But that’s four years away, Falgout said. For now, he and his team are concentrating on building HPC competency within the USGS.

“HPC is just another tool in the toolbox,” he said. “So, what we’re trying to do is get people more comfortable with running large-scale simulations of data analytics and modeling. We want to make that a fabric within USGS so that it changes the culture within the Survey. We don’t want people having to deal with things on their own. We want to make sure everybody has the right tools, or at least good enough tools, to get their jobs done.”