The Analyze stage of the Science Data Lifecycle represents activities associated with the exploration and assessment of data, where hypotheses are tested, discoveries are made, and conclusions are drawn.
Data Analysis' Place in the Data Lifecycle
Exploring and interpreting data at the Analyze stage of the data lifecycle frequently reveals that additional data acquisition or processing is needed to meet the goals of a research project.
Process and Analyze - Closely related activities
It can sometimes be difficult to determine where processing ends and analysis begins. In part this is because the two concepts are often intermingled to ensure that both data and research products meet a common set of goals.
Data analysis may be required to better understand data content, context, and quality. In this stage of the Lifecycle, conclusions or new datasets are generated and methods are documented. Analytical activities include the following:
Statistical methods are applied to data to derive patterns, make generalizations, detect trends, and estimate the uncertainty associated with the data. Many methods appropriate to work at the USGS, both within and beyond hydrology, can be found in the classic 2002 reference book by Helsel and Hirsch, "Statistical Methods in Water Resources."
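As a rough illustration of trend detection and uncertainty, the sketch below implements two widely used nonparametric methods: the Mann-Kendall trend test and the Theil-Sen slope estimator. It is a minimal example under stated assumptions, not a USGS-endorsed implementation: the function names and the sample values are hypothetical, ties receive no variance correction, and the normal approximation for the p-value is only reasonable for roughly ten or more observations.

```python
from itertools import combinations
from math import erf, sqrt
from statistics import median


def mann_kendall(values):
    """Return (S, z, p) for a two-sided Mann-Kendall trend test.

    Assumes observations are in time order. No tie correction is applied.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(values, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return s, z, p


def theil_sen_slope(times, values):
    """Return the median of all pairwise slopes (a robust trend estimate)."""
    slopes = [(values[j] - values[i]) / (times[j] - times[i])
              for i, j in combinations(range(len(values)), 2)
              if times[j] != times[i]]
    return median(slopes)


# Hypothetical annual mean values, invented for illustration only.
years = list(range(2001, 2011))
flows = [12.1, 12.4, 11.9, 12.8, 13.0, 13.3, 12.9, 13.6, 13.9, 14.2]

s, z, p = mann_kendall(flows)
print(f"S = {s}, z = {z:.2f}, p = {p:.4f}")
print(f"Theil-Sen slope = {theil_sen_slope(years, flows):.3f} units per year")
```

Rank-based methods like these are often preferred for environmental data because they are less sensitive to outliers and skewed distributions than ordinary least squares.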
Graphical representations of events or scientific data enhance our understanding. These may be static images that show a pattern or clarify the connections between elements in a complex system; animated graphics and videos; or maps that show data in different ways.
- Visualizing Avian Influenza
- Drought in the Colorado River Basin
- Map of White-nose syndrome in bats and time series map
- National Climate Change Viewer (NCCV): A web-based visualization of large climate data.
- Map of latest earthquakes around the world:
- Visualization details of the Virginia, USA earthquake of 2011
- National Land Cover Database Evaluation, Visualization, and Analysis Tool
According to the Esri GIS Dictionary, spatial analysis is "the process of examining the locations, attributes, and relationships of features in spatial data through overlay and other analytical techniques in order to address a question or gain useful knowledge. Spatial analysis extracts or creates new information from spatial data." A good resource for understanding spatial analysis can be found in the USGS report, "A Practical Primer on Geostatistics."
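The overlay idea in that definition can be sketched with a point-in-polygon test: given sample locations and named zones, count how many samples fall inside each zone. This is a minimal illustration that assumes planar coordinates and simple polygons without holes; the zone names, coordinates, and function names are hypothetical, and production work would normally rely on GIS software rather than hand-written geometry.

```python
def point_in_polygon(x, y, polygon):
    """Return True if (x, y) lies inside polygon (ray-casting method).

    polygon is a list of (x, y) vertices; the last vertex connects
    back to the first. Points exactly on an edge may go either way.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_in_zones(points, zones):
    """Overlay points on named polygons and count the points in each zone."""
    return {name: sum(point_in_polygon(x, y, polygon) for x, y in points)
            for name, polygon in zones.items()}


# Hypothetical zones and sample sites, invented for illustration only.
zones = {
    "north": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "south": [(0, -4), (4, -4), (4, 0), (0, 0)],
}
sites = [(1, 1), (2, 3), (3, -2), (9, 9)]

print(count_points_in_zones(sites, zones))
```

The result is new information (counts per zone) that neither input layer contains on its own, which is the sense in which spatial analysis "extracts or creates new information from spatial data."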
Analytic methods for detection of objects and patterns within images are used to identify features and derive time-based information that is difficult or impossible to obtain in other ways.
- USGS Earth Resources Observation and Science (EROS) Center is a leader in processing earth imagery data:
- Remote Sensing Technologies - a resource for various techniques and methods
- Land Change Monitoring, Assessment, and Projection (LCMAP) system analyzes the Landsat data archive to detect and characterize land cover change.
- Classifying individual tree species using airborne lidar
- iCoast: Citizen-science assisted evaluation of hurricane damage.
- Barrier Island Shorelines Extracted from Landsat Imagery
- Assessment of landslide hazards from the April 25, 2015 Gorkha, Nepal earthquake sequence
- Analysis of video surveillance used for detecting species near solar project power towers
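One of the simplest ways to derive time-based information from imagery is image differencing: compare two co-registered images of the same area and flag pixels whose values changed by more than a threshold. The sketch below is a toy version using nested lists; the pixel values, threshold, and function names are hypothetical, and the projects listed above use considerably more sophisticated methods than this.

```python
def change_mask(before, after, threshold):
    """Return a grid of booleans marking pixels that changed by more than threshold.

    before and after are equally sized 2-D lists of numeric pixel values
    and are assumed to be co-registered (same area, same grid).
    """
    if len(before) != len(after) or any(
            len(row_b) != len(row_a) for row_b, row_a in zip(before, after)):
        raise ValueError("images must have identical dimensions")
    return [[abs(a - b) > threshold for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]


def changed_fraction(mask):
    """Return the fraction of pixels flagged as changed."""
    total = sum(len(row) for row in mask)
    changed = sum(sum(row) for row in mask)
    return changed / total if total else 0.0


# Hypothetical 3x3 single-band images, invented for illustration only.
image_t1 = [[52, 50, 49],
            [51, 48, 50],
            [53, 52, 51]]
image_t2 = [[51, 50, 80],
            [50, 79, 82],
            [52, 51, 50]]

mask = change_mask(image_t1, image_t2, threshold=10)
print(mask)
print(f"{changed_fraction(mask):.0%} of pixels changed")
```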
Models are tools (usually software) for abstraction and simplification of natural systems that allow us to describe and explore those systems and make predictions about system behavior.
- Landcover modeling
- Modeling Hydrologic Resources:
- COAWST: Coupled-Ocean-Atmosphere-Wave-Sediment Transport Modeling System to explore how coastal ocean and atmospheric processes may impact coastal change.
- Earthquake modeling:
  - Fault rupture time series modeling
  - ShakeMap model describing shaking during an earthquake
  - Incorporating induced seismicity in the 2014 United States National Seismic Hazard Model
- Ecosystem and habitat models:
  - National Gap Analysis Program (GAP): species data and modeling
  - Evaluating and Ranking Threats to the Long-Term Persistence of Polar Bears: a polar bear ecosystem model (Bayesian network)
  - Modeling Desert Tortoise Habitat
  - Habitat Suitability Index Models
  - Avian habitat modeling
  - Modeling suitable habitat of invasive red lionfish
  - Modeling Ecological Flow Regime
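To make the idea of a model as a simplification concrete, the sketch below shows one possible form of a habitat suitability index: each environmental variable is mapped to a score between 0 and 1 through a piecewise-linear curve, and the scores are combined with a geometric mean. This is an assumed, illustrative formulation only; the variables, breakpoints, and function names are invented and do not come from any of the models listed above.

```python
from math import prod


def suitability(value, breakpoints):
    """Interpolate a 0-1 suitability score from (value, score) breakpoints.

    breakpoints must be sorted by strictly increasing value; inputs
    outside the range are clamped to the nearest end score.
    """
    if value <= breakpoints[0][0]:
        return breakpoints[0][1]
    if value >= breakpoints[-1][0]:
        return breakpoints[-1][1]
    for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
        if x0 <= value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)


def habitat_suitability_index(scores):
    """Combine component scores with a geometric mean.

    With this choice, a single unsuitable variable (score 0) makes the
    whole site unsuitable.
    """
    if not scores:
        raise ValueError("need at least one component score")
    if any(not 0.0 <= score <= 1.0 for score in scores):
        raise ValueError("scores must be between 0 and 1")
    return prod(scores) ** (1.0 / len(scores))


# Hypothetical curves: percent canopy cover and water depth in meters.
cover_curve = [(0, 0.0), (50, 1.0), (100, 1.0)]
depth_curve = [(0, 1.0), (10, 0.0)]

scores = [suitability(25, cover_curve), suitability(5, depth_curve)]
print(f"HSI = {habitat_suitability_index(scores):.2f}")
```

Writing the assumptions down as explicit curves and a combining rule is what lets a model be explored: changing a breakpoint or the combining rule shows how sensitive the prediction is to that assumption.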
Interpretation is the act of using data and analytic output to evaluate hypotheses and methods, extrapolate from observations to predictions, detect patterns, and explore the consequences of assumptions. The term 'interpretive' has special meaning within the USGS; see the USGS Fundamental Science Practices site for an explanation and examples of interpretive and noninterpretive data products.
Reproducible science is a foundation of our scientific work, so analysis methods must be documented well enough to support reproducibility. This documentation is usually included in published works such as methods papers, research publications, and journal articles. Search the USGS Publications Warehouse for analysis examples covering topics of specific interest.
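Alongside published documentation, an analysis script can record what it did in a machine-readable form. The sketch below writes a small record containing a checksum of the input file, the method name, the parameters, and the software version. The field names and functions are hypothetical and do not represent a USGS-required format; they only illustrate the kind of detail that helps someone rerun an analysis.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_analysis_record(input_path, method, parameters):
    """Collect the details needed to repeat an analysis run."""
    return {
        "input_file": str(input_path),
        "input_sha256": sha256_of_file(input_path),  # detects changed inputs
        "method": method,
        "parameters": parameters,
        "python_version": platform.python_version(),
        "run_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def record_to_json(record):
    """Serialize the record with stable key order for easy comparison."""
    return json.dumps(record, indent=2, sort_keys=True)
```

A call such as `build_analysis_record("flows.csv", "Mann-Kendall", {"alpha": 0.05})` (the file name here is a placeholder) produces a dictionary that can be saved next to the analysis outputs with `record_to_json`.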
What the U.S. Geological Survey Manual Requires:
Policies that apply to the Analyze stage largely deal with making sure there is appropriate documentation of the tools and methods used in analysis, and that the analyses directly result from the data acquired and processed.
The USGS Manual Chapter 500.25 - USGS Scientific Integrity discusses the USGS’s dedication to “preserving the integrity of the scientific activities it conducts and that are conducted on its behalf” by adhering to Department of the Interior 305 DM 3 - Integrity of Scientific and Scholarly Activities.
The USGS Manual Chapter 502.2 - Fundamental Science Practices: Planning and Conducting Data Collection and Research includes requirements for process documentation of analytical methods and techniques.
"Documentation: Data collected for publication in databases or information products, regardless of the manner in which they are published (such as USGS reports, journal articles, and Web pages), must be documented to describe the methods or techniques used to collect, process, and analyze data (including computer modeling software and tools produced by USGS); the structure of the output; description of accuracy and precision; standards for metadata; and methods of quality assurance."
The USGS Manual Chapter 502.4 - Fundamental Science Practices: Review, Approval, and Release of Information Products addresses documentation of the methodology used to create data and generate research results.
"Methods used to collect data and produce results must be defensible and adequately documented."