502.6 - Fundamental Science Practices: Scientific Data Management Lifecycle

This chapter establishes the Bureau’s overarching scientific data management requirements on the basis of a data lifecycle model.  The model offers a high-level view of USGS data collection, data handling, and data dissemination activities.

Issuance Number:        SM 502.6

Subject:                          Fundamental Science Practices: Scientific Data Management Lifecycle

Expiration Date:           Retain until superseded or cancelled.

Responsible Office:      Office of Science Quality and Integrity

Instruction:                   This Survey Manual (SM) chapter supersedes SM 502.6, Fundamental Science Practices: Scientific Data Management, dated January 13, 2017.

Approving Official:      /s/ William Cunningham

                                         Director, Office of Science Quality and Integrity

1.      Purpose and Scope.

A.  U.S. Geological Survey (USGS) scientific data (also referred to as “data”), as defined in SM 502.8, represent assets with potential value beyond any immediate research use and therefore must be accounted for and properly managed throughout their lifecycle. This SM chapter describes the Bureau’s overarching scientific data management requirements using the USGS Science Data Lifecycle Model (Data Lifecycle). The Data Management website provides guidance on required USGS data collection, data handling, and data dissemination activities in support of the Data Lifecycle. The purpose of the Data Lifecycle is to ensure that scientific data are discoverable, well described, and preserved for access and reuse. The Data Lifecycle also serves as a framework to help evaluate and improve the requirements and practices for managing USGS scientific data and to identify areas in which new tools and standards are needed.

B.  USGS scientific data encompass a wide variety of information, such as textual and numeric information, instrument outputs, statistics, images (fixed or moving), diagrams, maps, audio or video recordings, preliminary (also referred to as “provisional”) and dynamic data, processed or derived data, model inputs and final outputs, simulations, observations of natural phenomena at explicit times and locations, and data stored on any type of media released as information products or through approved online databases or Web services (refer to SM 502.8 and SM 502.5).

C.  Fundamental Science Practices (FSP) requirements detailed in this policy apply to all USGS employees, political appointees, volunteers, including emeriti, as well as contractors, cooperators, partners, and other external parties who assist with USGS scientific data management activities. Scientific integrity requirements ensure the free flow of scientific information as detailed in 305 DM 3. Failing to comply with FSP can constitute a loss of USGS and Department scientific integrity.

2.      References.

A.  Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure (Executive Orders 12906, April 13, 1994 and 13286, March 5, 2003)

B.  Ensuring Free, Immediate, and Equitable Access to Federally Funded Research (Office of Science and Technology Policy (OSTP) memorandum, August 25, 2022)

C.  Foundations for Evidence-Based Policymaking Act of 2018 (Public Law 115–435, January 14, 2019)

D.  Geospatial Data Act of 2018 (43 U.S. Code Chapter 46 – Geospatial Data, October 5, 2018)

E.  Increasing Access to the Results of Federally Funded Scientific Research (Office of Science and Technology Policy (OSTP) memorandum, February 22, 2013)

F.  Open Data Policy—Managing Information as an Asset (Office of Management and Budget (OMB) memorandum, May 9, 2013)

G.  Phase 2 Implementation of the Foundations for Evidence-Based Policymaking Act of 2018: Open Government Data Access and Management Guidance (Office of Management and Budget (OMB) memorandum, January 15, 2025)

H.  Restoring Gold Standard Science (Executive Order 14303, May 23, 2025)

I.  SM Part 502 - USGS Fundamental Science Practices and Related Policies 

J.  SM Part 1100 - USGS Publishing Policies

K.  SM 205.18, Authority to Approve Information Products

L.  SM 431.1, Records Management Roles and Responsibilities

M.  SM 500.25, Scientific Integrity

N.  SM 601.1, USGS Web Standards

O.  USGS Records Disposition Schedules

P.  USGS Data Management

Q.  USGS Public Access Plan

R.  USGS Science Data Lifecycle Model

3.      Policy.

A.  USGS scientific data must be managed throughout the Data Lifecycle as described in this section, and when approved for release, these data must be made available to the public in accordance with USGS FSP requirements described in SM 502.8, unless access must be restricted because of security, privacy, confidentiality, or other constraints (refer to SM 502.5).

B.  Additional requirements regarding review, approval, and release of USGS scientific data and information products are found in this and other FSP policies and related directives. Guidance that supports this policy and other FSP-related requirements is available on the USGS Data Management website.

C.  The requirements for each step of the Data Lifecycle that must be followed are:

(1)  Plan. A data management plan is required. Planning for USGS data collection and research activities (such as project work plans described in SM 502.2), proposals, and funding agreements for projects funded or managed by the USGS must include a data management plan prior to initiation of the work. The purpose of a data management plan is to document how project staff intend to meet USGS and Federal policies related to data and information management. The data management plan is a living document—it must be updated as a reference for actual data collection and management activities performed throughout the Data Lifecycle. The content of a data management plan may be developed as a stand-alone document or integrated into broader research planning documentation. The plan must include intended standards, actions, and responsible parties as appropriate for acquiring, processing, analyzing, preserving, publishing/sharing, describing, managing the quality of, and securing the data holdings, as well as tools and other considerations for performing these activities with or without collaborators and partners (https://www.usgs.gov/data-management/planning). Data acquisition agreements with collaborators, cooperators, or vendors must be created to clarify data sharing, release, and preservation responsibilities (https://www.usgs.gov/data-management/acquire). USGS scientists and authors must identify and include in the data management plan an acceptable digital repository for data publishing/sharing that complies with FSP requirements, and considers related software release planning, if applicable, and any records disposition requirements.

(2)  Acquire. Techniques for acquiring scientific data must be documented to ensure that USGS scientific findings are verifiable and to maintain the provenance and integrity of the data. Data acquired from collaborators, cooperators, or vendors should adhere to agreements regarding data sharing, release, and preservation responsibilities (https://www.usgs.gov/data-management/acquire). Data acquisition encompasses collecting or generating new data and/or adding to existing data holdings and may include data purchased or otherwise acquired for use in a USGS data product.

(3)  Process. Data processing steps must be documented to ensure the utility, quality, and integrity of the data and the ability to reproduce final released data from the original data (SM 502.2). Data processing denotes those actions or steps taken to verify, organize, transform, repair, integrate, or convert data to appropriate formats for subsequent use (https://www.usgs.gov/data-management/process).

(4)  Analyze. All data analyses must be documented or cited to ensure scientific quality and integrity and create a foundation for future research. Data analyses using methods or techniques not previously published must be documented and publicly available, as described in SM 502.2. Data analyses using published methods must cite those published methods as appropriate. Analyses involve actions taken to interpret data, detect patterns, develop explanations, and test hypotheses (https://www.usgs.gov/data-management/analyze).

(5)  Preserve. USGS must preserve, in a USGS-managed location, authoritative copies of data and associated metadata collected by or generated on behalf of the Bureau. Preservation includes procedures taken to ensure data integrity and persistence (https://www.usgs.gov/data-management/preserve). Deposit of data in USGS-owned acceptable repositories ensures the preservation requirements are met; however, in circumstances where data are published in acceptable non-USGS repositories or cannot be made public due to access restrictions (as described in SM 502.5), data and metadata must also be preserved on USGS-managed infrastructure. The appropriate USGS records disposition schedules and other applicable records management requirements described in SM 431.1 must be followed.

(6)  Publish/Share. USGS scientific data used to support scholarly conclusions in USGS-authored publications must be electronically released free to the public prior to, or concurrent with, the publication referencing these data (refer to USGS Public Access Plan). Scientific data not associated with scholarly publications must also be released in a timely manner. The release of data may occur at the end of the project, end of funding, or with some planned periodicity, unless access restrictions apply. Review and approval requirements for releasing data, including preliminary data, are described in SM 502.8. Controls must be in place to protect proprietary and predecisional data (SM 502.5) and the scientific integrity of the data (refer to Departmental Manual (DM) chapter 305 DM 3 and SM 500.25). For more information on data sharing and release, refer to https://www.usgs.gov/data-management/publishshare.
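
As an illustration only, the Plan requirements above can be expressed as a simple completeness check. The following Python sketch is not a USGS tool or template: the class names, field names, and activity keys are hypothetical, and because the policy requires these elements only "as appropriate," a real plan may legitimately omit some of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Activities named in the Plan step of this chapter. The key names are
# hypothetical labels chosen for this sketch, not USGS identifiers.
REQUIRED_ACTIVITIES = (
    "acquire", "process", "analyze", "preserve",
    "publish_share", "describe", "manage_quality", "backup_secure",
)


@dataclass
class ActivityPlan:
    """Intended standards, actions, and responsible parties for one activity."""
    standards: List[str] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)
    responsible_parties: List[str] = field(default_factory=list)


@dataclass
class DataManagementPlan:
    """Hypothetical in-memory form of a data management plan."""
    project: str
    repository: str = ""  # the digital repository identified for release
    activities: Dict[str, ActivityPlan] = field(default_factory=dict)


def missing_elements(plan: DataManagementPlan) -> List[str]:
    """List the plan elements that have not been filled in yet."""
    gaps = []
    if not plan.repository.strip():
        gaps.append("repository")
    for name in REQUIRED_ACTIVITIES:
        activity = plan.activities.get(name)
        if activity is None:
            gaps.append(name)
        elif not activity.responsible_parties:
            gaps.append(f"{name}.responsible_parties")
    return gaps
```

Because the data management plan is a living document, a check of this kind could be rerun whenever the plan is updated rather than only at project initiation.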

D.  The following cross-cutting elements must be performed continually across all stages of the Data Lifecycle.

(1)  Describe (Metadata, Documentation). Standardized metadata are required in order for USGS data to be approved for release and must be included in the USGS Science Data Catalog (refer to SM 502.7). Metadata enable reuse and reproducibility of research results. Refer to SM 502.8 for information about metadata for databases. Additional documentation about the data, beyond the metadata, may be necessary to specify the context of the data’s use in specific systems, applications, and settings and may include ancillary materials such as field notes (https://www.usgs.gov/data-management/describe-metadatadocumentation).

(2)  Manage Quality. Data quality assurance checks must be performed and documented throughout the Data Lifecycle. Data management activities must be completed in a consistent, objective, documented, and replicable manner to help ensure that quality objectives and verifiable results are achieved (refer to SM 502.2 and https://www.usgs.gov/data-management/manage-quality).

(3)  Backup and Secure. During all data management processes, backup copies of data must be maintained to allow recovery from loss due to human errors, hardware failures, computer viruses, power failures, or natural disasters. Best practices for backing up and securing data at all stages of the Data Lifecycle are available (https://www.usgs.gov/data-management/backup-secure).
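
One common way to confirm that a backup copy can support recovery is to compare checksums of the working files against the backup. The Python sketch below is a minimal illustration under that assumption; this chapter does not prescribe a verification method, and the function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(primary, backup):
    """Report files in the primary copy that are missing or differ in the backup."""
    primary_manifest = build_manifest(primary)
    backup_manifest = build_manifest(backup)
    problems = {}
    for rel_path, checksum in primary_manifest.items():
        if rel_path not in backup_manifest:
            problems[rel_path] = "missing from backup"
        elif backup_manifest[rel_path] != checksum:
            problems[rel_path] = "checksum mismatch"
    return problems
```

An empty result from `compare_copies` indicates that every file in the primary location has a byte-identical counterpart in the backup; it does not check for extra files that exist only in the backup.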

4.      Responsibilities.

A.  Associate Directors and Regional Directors. Associate Directors (ADs) and Regional Directors (RDs), as members of the USGS Executive Leadership Team (ELT), enforce policy for how scientific investigations, research, and related activities are carried out, as well as how data and information products are reviewed and approved. The ADs and RDs provide oversight and support for the data management activities in their respective mission areas and regions. They collaborate with each other to address issues or take corrective action regarding scientific data management and Data Lifecycle activities.

B.  Office of Science Quality and Integrity, Core Science Systems, FSP Advisory Council, Chief Data Officer, and Office of the Associate Chief Information Officer. The Office of Science Quality and Integrity (OSQI), Core Science Systems (CSS), and FSP Advisory Council (FSPAC) are responsible for jointly developing USGS data management policy and collaborating on the development of related guidance and procedures. The OSQI coordinates and communicates with the ADs, RDs, and the entire ELT as needed to address and resolve issues regarding implementation of the Data Lifecycle. The OSQI is also responsible for maintaining this (SM 502.6) and other FSP policy chapters. CSS develops and maintains comprehensive guidance and procedures on the USGS Data Management website. FSPAC develops and maintains comprehensive guidance and procedures on the FSP website. The Chief Data Officer communicates and supports the development, implementation, and maintenance of USGS Bureau-wide data management strategies, best practices, and tools to meet Federal and Department of the Interior (DOI) requirements. The Office of the Associate Chief Information Officer (ACIO) is responsible for the records management program that informs data records requirements.

C.  Science Center and Division Directors. Science Center (or equivalent) Directors (SCDs) have Bureau approval authority for scientific data and can reassign this authority to a designee (SM 205.18). SCDs ensure compliance with data management lifecycle requirements for scientific data produced in their Centers or offices. SCDs must assign Data Managers to steward their Center’s Data Lifecycle activities.

D.  USGS Scientists and Authors. USGS scientists, authors, and affiliates who manage the collection, generation, or acquisition of data are responsible for ensuring data are managed according to the Data Lifecycle. Scientists and authors are also responsible for ensuring that those who contribute to USGS Data Lifecycle activities are aware of and comply with USGS data management policies.

E.  Data Managers. Data managers are the assigned or designated individuals or teams responsible for stewarding scientific data through the Data Lifecycle activities. They collaborate with their SCDs, supervisors, records liaisons, and scientists and authors to conduct data stewardship activities and communicate USGS data management best practices.

F.  Records Management Officer. The USGS Records Management Officer, in collaboration with repository staff and data managers, ensures that Bureau-wide policies, standards, and procedures are in place to provide guidance on creating accurate and complete records and maintaining them throughout the science data lifecycle.

 
