Data Models to Support Integrated Research in the Arctic

By Dennis H. Walworth and John M. Pearce

Multiple science initiatives are aimed at understanding the state of biological and physical processes to inform critical public policy decisions in a changing Arctic. For example, as part of the USGS Changing Arctic Ecosystems initiative (Geiselman and others, 2011; Pearce and others, 2012; Oakley and others, 2012), scientists collect a variety of information to identify and understand the linkages between physical processes, ecosystems, and wildlife populations. Data are collected on temperature and precipitation, hydrological patterns, ice thaw dynamics, food characteristics (for example, vegetation growth, nutrient availability, invertebrate abundance), and wildlife population characteristics (for example, movement, foraging ecology, reproduction). Collectively, data from these studies will document the current state of habitat conditions and will enable modeling and forecasting of how future changes may affect landscapes and wildlife populations.

Research studies conducted by the USGS within the Changing Arctic Ecosystems initiative occur across a vast northern landscape, from sea ice to continuous permafrost to boreal-arctic transition zones. Data are acquired within individual study areas (for example, terrestrial sites noted in Figure 1), and empirical data derived from these sites are being used to develop theoretical models that can then be tested across the broader landscape. Additionally, each study collects substantial information on the many possible climate-influenced drivers of wildlife population change.

Figure 1. Map portraying terrestrial study locations (solid black circles) for the USGS Changing Arctic Ecosystems initiative across northern and west central Alaska.

We sought a method to visualize the information collected by the various studies engaged in Arctic research. Our primary objective was to produce a graphical depiction of data to facilitate conversations among scientists and research managers about data acquisition, project design, and science products. A secondary objective was to develop a unifying methodology to describe the data being acquired at different study locations across the Arctic and to make this approach adaptable to new sources of information that may be collected in the future. Direct comparisons could then be made among data collected at different times or across study areas, even when collected by different groups of scientists.

Figure 2. An example of a conceptual data model showing the relationship between two data entities.

Needing a methodology to represent data in an easy-to-assimilate, preferably visual manner, we sought a solution from the field of information modeling. Information modeling, or data modeling, is a methodology that offers well-vetted approaches to documenting and communicating data in a visual format. Developed as a software engineering specialization in the 1970s, information modeling is used today throughout industry and government both to understand and document the data that organizations use and to design effective business and data management practices around that enhanced understanding.

Here, we demonstrate a specific type of data modeling called conceptual data modeling (CDM). This type of modeling addresses the business perspective of data in a high-level and generalized way. CDM is an abstraction of data, unconcerned with the specifics or physical forms of the data. Due to its simplicity, it is ideal for communicating ideas about data to a broad audience.

A CDM contains “entities,” which are the main subjects under study, such as “Bird,” “Vegetation,” “Forage,” and “Habitat.” Lines are drawn between entities to identify that two entities relate to one another in some general way (Figure 2). Relationships may be described to clarify the meaning of the relationship; for example, in Figure 2, “Vegetation” can be either “Forage” or “Habitat” to a “Bird.”
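The entity-and-line structure described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used for the initiative: the class names (Entity, Relationship, ConceptualModel) and the relationship wording are assumptions made for the example, while the entity names come from Figure 2.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A main subject under study, for example "Bird" or "Vegetation"."""
    name: str


@dataclass(frozen=True)
class Relationship:
    """A line drawn between two entities, with an optional description."""
    a: Entity
    b: Entity
    description: str = ""


@dataclass
class ConceptualModel:
    """Hypothetical container for the entities and lines of one CDM."""
    name: str
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def entity(self, name):
        # Reuse an existing entity so each subject appears once in the model.
        return self.entities.setdefault(name, Entity(name))

    def relate(self, a, b, description=""):
        rel = Relationship(self.entity(a), self.entity(b), description)
        self.relationships.append(rel)
        return rel

    def related_to(self, name):
        """Sorted names of entities joined to `name` by at least one line."""
        found = set()
        for rel in self.relationships:
            if rel.a.name == name:
                found.add(rel.b.name)
            elif rel.b.name == name:
                found.add(rel.a.name)
        return sorted(found)


# The relationship wording below is an assumed paraphrase of Figure 2.
model = ConceptualModel("Figure 2 sketch")
model.relate("Vegetation", "Bird", "is forage or habitat for")
print(model.related_to("Bird"))
```

Because the model records only subjects and the lines between them, it stays independent of how or where the underlying data are stored, which is the point of a conceptual model.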

Our process of modeling data collected in the Changing Arctic Ecosystems initiative employed a two-stage approach to creating CDMs. First, we developed an enterprise or parent CDM, which modeled the overall initiative view of the data relevant to the science objectives (Figure 3).

Figure 3. Example of an enterprise or “parent” data model developed for the USGS Changing Arctic Ecosystems initiative.

Second, we developed a series of individual study models, each representing the data collected by a particular study (Figure 4). The study models include attribute groupings we call “data collections.” Data collections generally define a type of data collected, such as “Plant Biomass.” All study models were built from a managed common vocabulary. This standardization allows study models to be integrated across research studies, so that like data can be compared easily and accurately and the same data can serve multiple purposes, both present and future.
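One way to picture how a managed common vocabulary supports integration is the short Python sketch below. It is a sketch under stated assumptions, not the initiative's implementation: the study names, the vocabulary terms other than "Plant Biomass," and the functions validate and shared_collections are all hypothetical.

```python
from collections import defaultdict

# Managed common vocabulary: the only data collection names a study may use.
# "Plant Biomass" comes from the article; the other terms are hypothetical.
VOCABULARY = {"Plant Biomass", "Nutrient Content", "Movement"}


def validate(study_name, study_model):
    """Raise ValueError if a study uses a term outside the vocabulary."""
    for entity, collections in study_model.items():
        unknown = set(collections) - VOCABULARY
        if unknown:
            raise ValueError(
                f"{study_name}/{entity}: unknown terms {sorted(unknown)}"
            )


def shared_collections(studies):
    """Return (entity, data collection) pairs gathered by more than one study."""
    index = defaultdict(list)
    for study_name, study_model in studies.items():
        validate(study_name, study_model)
        for entity, collections in study_model.items():
            for collection in collections:
                index[(entity, collection)].append(study_name)
    return {key: names for key, names in index.items() if len(names) > 1}


# Hypothetical study models: entity name -> data collections for that entity.
studies = {
    "Ungulate study": {
        "Vegetation": ["Plant Biomass", "Nutrient Content"],
        "Ungulate": ["Movement"],
    },
    "Bird study": {
        "Vegetation": ["Plant Biomass"],
        "Bird": ["Movement"],
    },
}
print(shared_collections(studies))
```

Because both hypothetical studies use the same term for the same entity, the overlap in "Plant Biomass" data is found by a simple lookup. Had one study used a different label for the same measurement, validation would flag it before any comparison was attempted.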

Figure 4. Example of a component data model for data collected for the USGS Changing Arctic Ecosystems initiative to understand ungulate responses to their environment.

Through the application of information modeling techniques, we have created an integrated enterprise and study conceptual data model structure that documents the data being collected and summarized by a wide range of research in the Arctic. We now have a common vocabulary for understanding our data across multiple studies and study sites, as well as a template for documenting future studies and for understanding how new data acquisitions compare with existing data. Conceptual data modeling has proven to be a simple method of visualizing data acquisition from a scientifically objective perspective. The models enable comparative analysis, serve as a tool to facilitate conversation between researchers regarding the best strategies for meeting research objectives and integrating data, and provide a means to communicate study objectives and findings to land and resource management agencies and the public.

This article was originally submitted to the Community for Data Integration (CDI) for a conference presentation as an example of successes, challenges, and opportunities in data integration as applied to a research initiative and to open opportunities for potential collaboration with similar efforts through the CDI.

References

Geiselman, Joy, DeGange, Anthony, Oakley, Karen, Derksen, Dirk, and Whalen, Mary, 2011, Changing Arctic Ecosystems-Research to understand and project changes in marine and terrestrial ecosystems of the Arctic: U.S. Geological Survey Fact Sheet 2011-3136, 4 p. (Also available at http://pubs.er.usgs.gov/publication/fs20113136.)

Pearce, John, DeGange, Anthony, Flint, Paul, Fondell, Tom, Gustine, David, Holland-Bartels, Leslie, Hope, Andrew, Hupp, Jerry, Koch, Josh, Talbot, Sandra, Ward, David, and Whalen, Mary, 2012, Changing Arctic Ecosystems-Measuring and forecasting the response of Alaska's terrestrial ecosystem to a warming climate: U.S. Geological Survey Fact Sheet 2012-3144, 4 p. (Also available at http://pubs.er.usgs.gov/publication/fs20123144.)

Oakley, Karen, Whalen, Mary, Douglas, David, Udevitz, Mark, Atwood, Todd, and Jay, Chadwick, 2012, Changing Arctic ecosystems-Polar bear and walrus response to the rapid decline in Arctic sea ice: U.S. Geological Survey Fact Sheet 2012-3131, 4 p. (Also available at http://pubs.er.usgs.gov/publication/fs20123131.)