CDI Risk Map Completed

By Community for Data Integration (CDI) September 6, 2018

The Community for Data Integration (CDI) Risk Map Project is developing modular tools and services to benefit a wide group of scientists and managers who deal with various aspects of risk research and planning. Risk is the potential that exposure to a hazard will lead to a negative consequence to an asset such as human or natural resources. This project builds upon a Department of the Interior (DOI) project that is developing geospatial layers and other analytical results that visualize multi-hazard exposure to various DOI assets. The CDI Risk Map team has developed the following: a spatial database of hazards and assets, an API (application programming interface) to query the data, web services with GeoServer (an open-source geospatial server), and a modular map viewer and related infographics using the open-source visualization framework TerriaJS.

Principal Investigator: Nathan J Wood

Co-Investigators: Jeanne M Jones, Kevin D Henry, Jason T Sherba, Peter Ng

1. Overview

The USGS Western Geographic Science Center (WGSC) was asked to help the USGS Natural Hazards Mission Area (NHMA) and the DOI Office of Emergency Management (OEM) to characterize and communicate multi-hazard exposure of DOI populations, resources, land, and infrastructure. Hazards of OEM interest include a wide range of human health, ecosystem health, wildland fire, geophysical, meteorological, technological, and adversarial threats. Assets of OEM interest are also broad and are organized by life safety, infrastructure, natural resources, cultural resources, economic resources, and emergency services.

To complement the DOI Risk project, WGSC collaborated with the USGS Community for Data Integration (CDI) on the “CDI Risk Map” project to develop a prototype web application and a set of tools specific to the DOI risk assessment. The work applied data integration principles and standards such as shareable web services, application programming interfaces, modular and reusable code for the web viewer, and reusable graphics for visualizations. Within the USGS, these products and services would support the USGS CDI community in its efforts to develop transferable and modular tools, as well as the interest of USGS senior leadership in increasing the bureau’s capacity for integrative predictive science.

The collaborative nature of the OEM and CDI work involved various USGS science centers in multiple Mission Areas. This work resulted in mechanisms and structure for organizing and sharing knowledge, as well as identifying best practices in hazard characterization and classification, delineation of federal lands, identification of DOI assets, and data integration. This report summarizes the products developed and delivered for the CDI Risk Map project as well as the challenges and lessons learned.

2. Products of the CDI Risk Map Project

The CDI funding enabled the incorporation of data integration best practices into the prototype work for the DOI Risk project. At each step in the process of data acquisition, cleaning, analysis, and visualization, CDI goals of reusability and transferability were included to benefit the wider USGS community. The steps below describe the creation of the Risk Map products from data acquisition through web application deployment and summarize the technology stack supporting the project (Figure 1).

Figure 1. CDI Risk Map Development Infrastructure & Process Flow

Without CDI contributions to the USGS-OEM collaboration, products of the DOI Risk project would have been limited to traditional research reports that contained static maps of hazard exposure and a database of geospatial layers. There would have been limited ways for OEM partners to view geospatial products and no way for them to interact with results to create user-driven visualizations. Within the USGS, there would have been no effective way to share underlying data and no way to share lessons learned on data preparation, analysis, or visualization. CDI funding is contributing to the success of the USGS-OEM collaboration in multiple ways related to data acquisition, preparation, analysis, and visualization (Figure 2).

Figure 2. Table summarizing benefits to OEM-USGS collaboration due to CDI funding.

Many of the products described below, including links to code repositories, can be accessed from the DOI Risk page. This page is currently restricted to authenticated users. Contact cdi@usgs.gov with any further questions.

2.1 Pre-processing of asset data

For each asset dataset included in the DOI Risk assessment, we developed and applied a custom procedure to prepare the data for further processing and display within the CDI Risk Map. Custom processing included, when applicable, combining multiple input datasets into a comprehensive dataset and renaming, adding, removing, or combining layer attributes.

The following series of processing steps was then performed on all assets to prepare them for our exposure analysis:

Asset layer renamed following a naming convention and saved within a geodatabase
Asset layers clipped to DOI lands
Asset layers broken up into regional datasets and given a regional projection
State, County, DOI Region, Agency, and dataset restriction level added as attributes for each asset layer
Asset attribute tables cleaned to create uniform attribute names and values, remove unneeded fields, and set null values to NA
QA/QC performed, including visual inspection and checks of the coordinate system and metadata
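The attribute-cleaning step above can be illustrated with a minimal Python sketch. It is not the project's preprocessing script: the field aliases, the list of retained fields, and the sample record are hypothetical, and the real scripts operate on geodatabase layers rather than plain dictionaries.

```python
# Hypothetical mapping from source field names to uniform attribute names.
FIELD_ALIASES = {
    "ST": "State",
    "STATE_NAME": "State",
    "CNTY": "County",
    "AGENCY_NM": "Agency",
}

# Hypothetical list of fields to keep; anything else is dropped.
KEEP_FIELDS = ("State", "County", "DOI_Region", "Agency", "Restriction")


def clean_record(record):
    """Return one attribute row with uniform names, no extra fields, and NA for nulls."""
    renamed = {FIELD_ALIASES.get(name, name): value for name, value in record.items()}
    cleaned = {}
    for field in KEEP_FIELDS:
        value = renamed.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            cleaned[field] = "NA"  # missing or empty values become NA
        elif isinstance(value, str):
            cleaned[field] = value.strip()
        else:
            cleaned[field] = value
    return cleaned


raw = {"ST": "CA", "CNTY": " Marin ", "AGENCY_NM": None, "OBJECTID_1": 17}
print(clean_record(raw))
```

Keeping the aliases and retained fields in lookup tables, rather than in the function body, would let the same routine be applied to every asset layer.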

These preprocessing scripts are also available to the USGS community.

2.2 Pre-processing of hazard data

Preparation of hazard data ranged from simply adding an additional attribute to denote hazard level to building datasets from tabular input data and formatting for Risk Map use. We chose to represent hazards on an ordinal scale of 1 to 5, where 5 is the most severe. When hazard data was in a simple presence / absence format, such as landslides, all presence data was assigned the scale value of 5. Other hazard sources with a range of values were grouped using equal intervals into 5 categories.
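The scaling rules described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the handling of values that fall exactly on a class boundary (assigned to the higher class, except the maximum) is an assumption, and returning None for absence is an assumption because the text specifies a value only for presence.

```python
def equal_interval_level(value, vmin, vmax, classes=5):
    """Map a value in [vmin, vmax] to an ordinal level 1..classes using equal-width bins."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    if not vmin <= value <= vmax:
        raise ValueError("value is outside [vmin, vmax]")
    width = (vmax - vmin) / classes
    # int() truncates to the zero-based bin index; min() keeps value == vmax in the top class.
    return min(int((value - vmin) / width) + 1, classes)


def presence_level(present):
    """Presence/absence hazards: presence maps to 5; None for absence is an assumption."""
    return 5 if present else None


# Example with a hypothetical hazard measured on a 0-100 scale.
for sample in (0, 19.9, 20, 55, 100):
    print(sample, equal_interval_level(sample, 0, 100))
print(presence_level(True))
```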

2.3 Calculating asset exposure

Asset exposure was calculated for each asset layer by overlaying the asset layer with all hazard layers and attaching hazard attributes to each asset feature. We developed a script that accepts all combinations of Asset data (point, line, polygon) and Hazard data (raster, polygon) and calculates asset exposure for each asset feature. It was also necessary to develop a system for tracking the extent of each hazard and indicate in the hazard exposure analysis where hazard data did not exist (or was not collected) versus where there was no hazard present. Because of the built-in flexibility of this script in accepting assets and hazards in different formats, the exposure analysis scripts could be a useful tool for other hazard exposure modeling projects.
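The distinction between "no hazard data" and "no hazard present" can be illustrated with a simplified Python sketch. It is not the project's exposure script: it handles only point assets and rectangular hazard zones, whereas the actual script accepts point, line, and polygon assets and raster or polygon hazards. The class names, the NA and 0 codes, and the sample coordinates are assumptions made for illustration.

```python
from dataclasses import dataclass

NO_DATA = "NA"  # assumed code: the hazard was not mapped at this location
NO_HAZARD = 0   # assumed code: the hazard was mapped here and none is present


@dataclass
class Box:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


@dataclass
class HazardLayer:
    name: str
    extent: Box   # area over which the hazard was assessed
    zones: list   # list of (Box, level) pairs, with level on the 1-5 scale

    def level_at(self, x, y):
        """Return the highest hazard level at (x, y), NO_HAZARD, or NO_DATA."""
        if not self.extent.contains(x, y):
            return NO_DATA
        levels = [level for box, level in self.zones if box.contains(x, y)]
        return max(levels) if levels else NO_HAZARD


def exposure(asset_points, hazards):
    """Attach one value per hazard layer to each asset point."""
    return {
        asset_id: {hazard.name: hazard.level_at(x, y) for hazard in hazards}
        for asset_id, (x, y) in asset_points.items()
    }


flood = HazardLayer("flood", Box(0, 0, 10, 10), [(Box(2, 2, 4, 4), 3), (Box(3, 3, 5, 5), 5)])
assets = {"a": (3.5, 3.5), "b": (8, 8), "c": (20, 20)}
print(exposure(assets, [flood]))
```

In this sketch, asset "b" lies inside the assessed extent but outside every zone, so it is reported as having no hazard, while asset "c" lies outside the extent and is reported as having no data.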

2.4 Boundary layers

Two boundary layers, PAD-US and DOI Regions, were used in the exposure analysis and included in the DOI Risk Map viewer as reference layers. National scale PAD-US data was downloaded from the National Gap Analysis Project (GAP) Protected Areas Data Portal (https://gapanalysis.usgs.gov/padus/data/download/) and was transformed in several ways. DOI land was first extracted from the entire PAD-US dataset while paying special attention to the owned versus managed PAD-US land type. Land owned or managed by the DOI was extracted and used to clip asset layers like Critical Habitats where DOI management is an important consideration. The addition of land managed by the DOI was also important for land related to tribes, since the Bureau of Indian Affairs assists with the administration of these lands but does not own tribal land. For other assets, only land owned by the DOI was used to clip the asset layer. We also found that it was necessary to further limit PAD-US data by a list of units in the PAD-US "Agg_Src" field provided by Daniel Wieferich. The PAD-US data was then flattened to remove overlapping polygons.

2.5 Heatmap generation

Asset heatmaps were created to solve two problems: (1) data sensitivity and (2) data summary and aggregation. By creating aggregations of raw datasets, these two issues can be solved, resulting in the useful visualization of data at a variety of scales and allowing certain sensitive datasets to be presented in a secure form. Heatmaps were developed by creating hexagonal grid templates at three different scales. The three scales are designed to provide useful information that updates depending on the current scale of the map. The analogous scales are: (1) nationwide, (2) state, and (3) county/metropolitan area. For each of these templates, assets that fall within a hexagon are aggregated and their attributes are assigned to the overlapping hexagon. In most cases, the hexagons represent a count of point features. However, for linear features, the total length of assets is summarized. Hexagons were preferred over squares, as they have improved accuracy when ‘binning’ features, still fit seamlessly within an evenly spaced grid, and appear less distorted when draped across the extent of the United States and territories. Hexagonal heatmaps are an improvement over traditional rectangular grids and web mapping applications have no trouble serving them. Generation of hexagon grids that cover the entire United States and territories was computationally intensive, but only needed to be performed once. The resulting grids can be leveraged for future USGS data visualization projects.
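A minimal Python sketch of the hexagonal binning idea follows. It is not the project's grid-generation code: it assumes point coordinates in a projected (planar) coordinate system, pointy-top hexagons indexed by axial coordinates, and hypothetical cell sizes for the three scales. It counts points per hexagon; for linear features the text describes summing length instead.

```python
import math
from collections import Counter


def _cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon index."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so q + r + s == 0.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hex_cell(x, y, size):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).

    size is the distance from a hexagon's center to any of its corners.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _cube_round(q, r)


def hex_center(q, r, size):
    """Return the planar center of hexagon (q, r)."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r


def bin_points(points, size):
    """Count the points that fall in each hexagon."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Hypothetical cell sizes (in map units) for the three heatmap scales.
SCALES = {"nationwide": 50000.0, "state": 10000.0, "county": 2000.0}

points = [(0.0, 0.0), (120.0, 80.0), (30000.0, 500.0), (30100.0, 450.0)]
for name, size in SCALES.items():
    print(name, dict(bin_points(points, size)))
```

Because only aggregated counts per hexagon are retained, the individual point locations are not exposed in the output, which is the property that makes this approach useful for sensitive datasets.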

2.6 GeoServer and PostgreSQL

The technology stack includes the back-end tiers of the CDI Risk Map web development infrastructure: a PostgreSQL database server (with GIS extension) and a GeoServer geospatial data server, both hosted on the CHS AWS Cloud. We developed a collection of scripts to automate loading DOI Risk data into PostgreSQL and creating it in GeoServer. Because these tasks were performed frequently and repetitively throughout the development life cycle, the scripts became an indispensable resource for managing their complexity and reducing the time they required. Early on, we had a recurring issue with GeoServer inexplicably crashing, which raised concerns over its stability and reliability. We found solutions and integrated them into an AWS CloudFormation script, which can serve as a template for others to use for automating the setup and configuration of a GeoServer on the CHS AWS Cloud. All program scripts are in the DOI Risk Project code repository.

2.7 TerriaJS DOI Risk Map configuration

The IGEMS web application currently used by the DOI to visualize hazards information proved to be inadequate for storage and delivery of the geospatial information and visualizations related to hazard exposure of DOI assets. Therefore, the WGSC team sought a solution that would better serve OEM needs while also being flexible enough for use in other USGS research. The team decided to explore the use of TerriaJS, which is an open-source, standards-based framework for web portal development. TerriaJS is seeing increasing usage within the USGS.

For the CDI Risk project, a range of TerriaJS application configuration settings were altered to create a DOI Risk specific version of the application (Figure 3). This included adding the USGS logo, altering basemap thumbnails, centering the map viewer on the United States, and updating viewer colors. Many of these changes were incorporated from work by USGS research oceanographer Richard Signell in Woods Hole, MA, and others to create a USGS-relevant version of the TerriaJS application. While many changes were easily made within the TerriaJS configuration files, we found that in some cases altering the TerriaJS base code was necessary and required significant time and effort.

Figure 3. A screenshot of the TerriaJS application that was created to support the DOI Risk project.

2.8 Development of the Risk Map data catalog

The Risk Map data catalog includes asset, hazard, and boundary layers developed for the DOI Risk Project. The structure of the catalog and the appearance of the layers are defined within a TerriaJS initialization file and layer symbology files. Layer groupings defined within the initialization file create the catalog layout.
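To make the idea of layer groupings in an initialization file concrete, the following Python sketch generates a small catalog as JSON. It is illustrative only: the property names ("catalog", "type", "items", "url", "layers") follow the TerriaJS initialization format as we understand it for releases of that period and should be checked against the deployed version, and the endpoint and layer names are hypothetical rather than taken from the project.

```python
import json

# Hypothetical GeoServer WMS endpoint; the real service location is not shown here.
GEOSERVER_WMS = "https://example.invalid/geoserver/ows"


def wms_item(title, layer_name):
    """Describe one WMS-backed catalog entry (layer names here are hypothetical)."""
    return {"name": title, "type": "wms", "url": GEOSERVER_WMS, "layers": layer_name}


def group(title, items):
    """Describe a catalog group containing other groups or layer entries."""
    return {"name": title, "type": "group", "items": items}


init = {
    "catalog": [
        group("DOI Risk", [
            group("Assets", [wms_item("DOI Employees", "doirisk:employees")]),
            group("Hazards", [wms_item("Landslides", "doirisk:landslides")]),
            group("Boundaries", [wms_item("DOI Regions", "doirisk:doi_regions")]),
        ])
    ]
}

print(json.dumps(init, indent=2))
```

Generating the file from code rather than editing it by hand may help keep group names and layer identifiers consistent when the catalog contains many layers.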

Asset datasets within the Risk Map are further grouped to show a heatmap at broad scales and individual asset features with exposure attributes when zoomed to a fine enough scale (Figure 4). These layer groupings are defined within GeoServer and the layer symbology.

Figure 4. Grouped DOI Employees layer at three progressively finer scales.

Layer symbology files for each layer within the Risk Map were created in Styled Layer Descriptor (SLD) format using the python-sld package. The SLD files define the layer styling when combined with a layer on GeoServer. The SLD files also define the legend layout and scale-based layer visibility for grouped layers. Python scripts were developed for styling raster and vector data based on break values defined in a table, along with other layer attributes. These scripts could be easily adapted for other projects where many layers need to be symbolized in GeoServer.
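The idea of generating an SLD from a table of break values can be sketched as follows. Note the substitution: the project used the python-sld package, whereas this sketch builds the same kind of document with Python's standard-library ElementTree so that it does not depend on any python-sld calls. The layer name, attribute name, colors, and scale threshold are hypothetical.

```python
import xml.etree.ElementTree as ET

SLD_NS = "http://www.opengis.net/sld"
OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("sld", SLD_NS)
ET.register_namespace("ogc", OGC_NS)


def _el(parent, ns, tag, text=None, **attrib):
    """Append a namespaced child element, optionally with text and attributes."""
    node = ET.SubElement(parent, f"{{{ns}}}{tag}", attrib)
    if text is not None:
        node.text = str(text)
    return node


def build_sld(layer_name, attribute, breaks, max_scale=None):
    """Build an SLD 1.0 polygon style with one rule per class.

    breaks: list of (lower, upper, hex_color, label); lower inclusive, upper exclusive.
    max_scale: optional scale denominator above which the rules are not drawn.
    """
    root = ET.Element(f"{{{SLD_NS}}}StyledLayerDescriptor", {"version": "1.0.0"})
    layer = _el(root, SLD_NS, "NamedLayer")
    _el(layer, SLD_NS, "Name", layer_name)
    style = _el(layer, SLD_NS, "UserStyle")
    fts = _el(style, SLD_NS, "FeatureTypeStyle")
    for lower, upper, color, label in breaks:
        rule = _el(fts, SLD_NS, "Rule")
        _el(rule, SLD_NS, "Title", label)  # rule titles label the legend entries
        both = _el(_el(rule, OGC_NS, "Filter"), OGC_NS, "And")
        lower_test = _el(both, OGC_NS, "PropertyIsGreaterThanOrEqualTo")
        _el(lower_test, OGC_NS, "PropertyName", attribute)
        _el(lower_test, OGC_NS, "Literal", lower)
        upper_test = _el(both, OGC_NS, "PropertyIsLessThan")
        _el(upper_test, OGC_NS, "PropertyName", attribute)
        _el(upper_test, OGC_NS, "Literal", upper)
        if max_scale is not None:
            _el(rule, SLD_NS, "MaxScaleDenominator", max_scale)
        fill = _el(_el(rule, SLD_NS, "PolygonSymbolizer"), SLD_NS, "Fill")
        _el(fill, SLD_NS, "CssParameter", color, name="fill")
    return ET.tostring(root, encoding="unicode")


# Hypothetical break table for a hazard level attribute on the 1-5 scale.
BREAKS = [
    (1, 2, "#ffffb2", "1 (lowest)"),
    (2, 3, "#fecc5c", "2"),
    (3, 4, "#fd8d3c", "3"),
    (4, 5, "#f03b20", "4"),
    (5, 6, "#bd0026", "5 (most severe)"),
]

print(build_sld("hazard_layer", "level", BREAKS, max_scale=5000000))
```

Driving the rules from a table means that a change to class breaks or colors only requires editing the table and regenerating the style files.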

2.9 DOI Risk geodatabase API development

The DOI Risk REST API is a web service hosted on the CHS AWS Cloud that provides an interface to the extensive collection of DOI Risk related data. The API allows requests for DOI Risk data to be made from a web browser or programmatically from applications such as the TerriaJS-based CDI Risk web application (Figure 5). A benefit of the REST API is that it controls the content that can be requested and delivered to users and keeps the physical location of the data hidden. The API can be programmatically called within tools and applications using any programming language.

Figure 5. DOI Risk application programming interface (API)

The API delivered at the close of the CDI funding lays the foundation for future expansion. Currently, only asset data is stored in the database and all queries reference this data only. Hazard data will be added as the project expands. Initial proposal plans included a simple browser-based query tool to examine the data. With the inclusion of graphs and charts in the TerriaJS application and custom calls added to the API to tailor data for TerriaJS consumption, this additional query tool was deemed unnecessary.

2.10 TerriaJS code contribution

TerriaJS has been identified as a useful tool for USGS research, and momentum within CDI has been generated towards customizing the software for USGS use. Adoption of TerriaJS by the DOI Risk Project as a data catalog and map viewer has led to several opportunities to extend its functionality. When reviewing the software, our group identified its charting capabilities as a major weakness. This area was targeted as an opportunity to improve TerriaJS’s functionality and create useful tools for the CDI Risk Map project.

TerriaJS’s existing charting functionality was limited to time-series line charts. To better serve our project and other USGS work, we added functionality to create categorical bar and pie charts. An example use case for the bar charts is to display the number of DOI employees exposed to each type of hazard (Figure 6).

Figure 6. New TerriaJS bar chart functionality displaying DOI Employee exposure

Pie charts can be useful when delving deeper into the heatmap information, to display what proportion of assets falls within each DOI bureau jurisdiction (Figure 7).

Figure 7. New TerriaJS pie chart functionality displaying DOI asset breakdown by Bureau jurisdiction

This functionality was integrated with the asset exposure analysis API to provide a front-end for retrieving this data. TerriaJS code contributions are hosted on an internal USGS code repository and can easily be added to existing TerriaJS instances to use the new charting functionality by following the instructions here: https://docs.terria.io/guide/contributing/development-environment/.

2.11 Risk Map deployment

AWS CloudFormation scripts were developed to automate the deployment of the Risk Map application on the USGS CHS AWS environment. This includes the installation of dependencies, the Risk Map application, and a proxy server to cache map tiles. For USGS researchers interested in deploying TerriaJS, the automated deployment scripts and server infrastructure developed for this project could serve as a useful example.

2.12 Risk Map potential within USGS

Features of the Risk Map have the potential to add value for the USGS user community. Built into the TerriaJS geospatial catalog is the ability to add data to the viewer and display it alongside the DOI Risk assets, hazards, and boundaries. For example, local data (e.g., local habitat data or classified data) can be added to the web application to create a local visualization (Figure 8). Operational data in the form of web services can be added to the web application to support preparedness or response efforts (Figure 9). In addition, researchers could use Risk Map web services to do their own analyses or generate new products (Figure 10). These features show the potential of the types of integrated predictive science that result from shareable, reusable products and services.

Figure 8. Example of adding NWS watches and warnings with DOI employees to understand potential life safety issues.

Figure 9. Prototype example of DOI-based ShakeCast product using DOI Risk buildings web service.

3. Summary

We found TerriaJS to be useful for providing access to a large catalog of data and for the basic display of this data. The user can choose which layers to view, keeping the left panel uncluttered and more informative. We found that the extension of TerriaJS functionality is possible, but it can only go so far, and more complex analysis and visualization without sacrificing usability may require a custom web viewer application.

Our work with GeoServer showed it to be a viable open-source alternative for serving geospatial web services, but deploying a stable version in the cloud required significant effort. We hope that our custom deployment will become a template for others in the USGS.

We hope that our PostgreSQL database and API will be useful to the larger USGS and DOI community as a resource for accessing the DOI Risk data in a human-readable format. The database currently contains only our asset-exposure vector data and therefore the API can only respond to queries on part of the DOI Risk data. Our hazard heatmaps are served as rasters and stored on the same server as GeoServer. Eventually we plan to add the original hazard data to the database in vector form and expand the API queries to this data as well.

Source: USGS ScienceBase (id: 5b91a0c2e4b0702d0e808bb2)
