502.6 - Fundamental Science Practices: Scientific Data Management

This chapter establishes the Bureau’s overarching scientific data management requirements on the basis of a data lifecycle model.  The model offers a high-level view of USGS data collection, data handling, and data dissemination activities.

Date: 01/13/2017

Responsible Office: Office of Science Quality and Integrity

Instruction: This is a new Survey Manual (SM) chapter that replaces Instructional Memorandum (IM) OSQI-2015-01, issued February 19, 2015.

1.  Purpose and Scope.

A.  U.S. Geological Survey (USGS) data represent corporate assets with potential value beyond any immediate research use and therefore need to be accounted for and properly managed throughout their lifecycle.  This chapter establishes the Bureau’s overarching scientific data management requirements on the basis of a data lifecycle model.  The model offers a high-level view of USGS data collection, data handling, and data dissemination activities.  By applying the elements in the lifecycle model, USGS scientists can ensure that data are discoverable, well described, and preserved for access and use beyond the life of research projects.  The data lifecycle model also serves as a structure to help evaluate and improve the requirements and practices for managing USGS scientific data when necessary, and to identify areas in which new tools and standards are needed.  Additional SM chapters that focus on specific elements of the data lifecycle will also be available.

B.  USGS scientific data encompass a wide variety of information including textual and numeric information, instrument readouts, statistics, images (fixed or moving), diagrams, maps, and audio or video recordings.  They include raw or processed, published, and archived data, such as data generated by experiments, models, simulations, and observations of natural phenomena at explicit times and locations, as well as data stored on any type of media.

2.  References.

A. Open Data Policy—Managing Data as an Asset (Office of Management and Budget (OMB) memorandum, May 9, 2013) 

B. Increasing Access to the Results of Federally Funded Scientific Research (Office of Science and Technology Policy (OSTP) memorandum, February 22, 2013) 

C. Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure (Executive Orders 12906 and 13286) 

D. 305 DM 3 – Integrity of Scientific and Scholarly Activities 

E. 378 DM 1 – Data Resource Management 

F. SM 431.1 – Records Management Program 

G. SM 431.11 – Litigation Holds 

H. SM 502.1 – Fundamental Science Practices: Foundation Policy 

I. SM 502.2 – Fundamental Science Practices: Planning and Conducting Data Collection and Research 

J. SM 502.4 – Fundamental Science Practices: Review, Approval, and Release of Information Products 

K. SM 502.5 – Fundamental Science Practices: Safeguarding Unpublished USGS Data, Information, and Associated Scientific Materials 

L. SM 502.7 – Fundamental Science Practices: Metadata for USGS Scientific Information Products Including Data 

M. SM 502.8 – Fundamental Science Practices: Review and Approval of Scientific Data for Release 

N. SM 502.9 – Fundamental Science Practices: Preservation Requirements for Scientific Data 

O. SM 600.5 – Information Technology Systems Security - General Requirements 

P. SM 601.1 – USGS Web Standards 

Q. SM 1100.1 – Information Product Planning 

R. SM 1100.3 – USGS Publication Series 

S. SM 1100.4 – Use of Outside Publications, Including Abstracts 

T. USGS Records Disposition Schedules 

U. USGS Data Management website 

V. USGS Fundamental Science Practices website 

W. USGS Public Access Plan

3.  Policy. 

A.  USGS scientific data must be managed throughout the data lifecycle as described in section 4, and when approved for release, these data must be made available to the public in accordance with USGS Fundamental Science Practices (FSP) requirements (SM 502.8).  Additional requirements such as those regarding review, approval, and release of all USGS science data and information are found in this and other underlying and related FSP policies.

B.  Guidance and procedures that support this chapter and other FSP related requirements are available on the USGS Data Management website and the FSP website.

4.  Elements of the USGS Data Lifecycle Model.  The elements described below represent an overview of results to be achieved.  Refer to the Data Management website for more detailed information.

A.  Plan.  The project work plan (SM 502.2) for every research project funded or managed by the USGS must include a data management plan prior to initiation of the project.  A data management plan will include standards and intended actions as appropriate to the project for acquiring, processing, analyzing, preserving, publishing/sharing, describing, managing the quality of, backing up, and securing the data holdings (see the Data Management website).  The data management plan is a living document; it should be updated as needed to reflect the reality of the scope of work, and it serves as a record of the data management activities throughout the lifecycle of the project.
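As an illustration only, the elements that a data management plan addresses under section 4.A could be tracked as a simple checklist.  The sketch below is not a USGS tool or template: the class name, the element identifiers, and the missing_elements helper are hypothetical, and only the list of elements is taken from the paragraph above.

```python
from dataclasses import dataclass, field

# Elements named in section 4.A; the identifiers themselves are hypothetical.
PLAN_ELEMENTS = (
    "acquire", "process", "analyze", "preserve", "publish_share",
    "describe", "manage_quality", "backup", "secure",
)


@dataclass
class DataManagementPlan:
    """Hypothetical record of intended actions for each lifecycle element."""
    project: str
    actions: dict = field(default_factory=dict)  # element -> description

    def missing_elements(self):
        """Return the elements that have no documented standard or action."""
        return [e for e in PLAN_ELEMENTS if not self.actions.get(e, "").strip()]


# Example: a plan that so far documents only two of the nine elements.
plan = DataManagementPlan(
    project="Example project",
    actions={
        "acquire": "Field sampling following a documented protocol",
        "backup": "Nightly copy to a second storage location",
    },
)
print(plan.missing_elements())
```

Because the plan covers each element only as appropriate to the project, an entry reported as missing is a prompt for review rather than proof of noncompliance.  Rerunning such a check whenever the plan is updated fits its role as a living document.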

B.  Acquire.  Data acquisition encompasses collecting new data or adding to existing data holdings and may include data purchased or otherwise acquired for use in a USGS data product.  Methods and techniques for acquiring research data must be planned and documented to ensure that USGS scientific findings are verifiable.  Agreements with collaborators/partners and vendors must be created to clarify data ownership responsibilities (see https://www.usgs.gov/data-management/acquire).

C.  Process.  Data processing denotes those actions or steps taken to verify, organize, transform, repair, integrate, and convert data to appropriate formats for subsequent use (see https://www.usgs.gov/data-management/process).  Processing methods and steps must be documented to ensure the utility, quality, and integrity of the data and the ability to reproduce final released data from the original raw data (SM 502.2).

D.  Analyze.  Analyses involve actions taken to interpret data, detect patterns, develop explanations, and test hypotheses (see https://www.usgs.gov/data-management/analyze).  Data analyses must be documented as described in SM 502.2.

E.  Preserve.  Preservation includes actions and steps taken to ensure that data are retained and accessible consistent with the appropriate USGS Records Disposition Schedules and other applicable requirements (SM 431.1).  Archiving or transfers to an appropriate data repository or the National Archives and Records Administration (NARA) are integral aspects of data preservation (see https://www.usgs.gov/data-management/preserve).  Controls must be in place to protect proprietary and predecisional data (SM 502.5) and the scientific integrity of the data (refer to Departmental Manual (DM) chapter 305 DM 3 and SM 500.25).  USGS data approved for release must be preserved as described in SM 502.9.

F.  Publish/Share.  USGS scientific data used to support scholarly conclusions in USGS-authored or USGS-funded publications must be released free to the public consistent with the USGS Public Access Plan, that is, before or commensurate with the publication using these data.  Preliminary data may be released when appropriate.  Final project data approved for release must be made available free of charge at the end of the project unless the agency determines that a demonstrated circumstance requires the data not be made publicly available; for example, in cases where access must be restricted because of security, privacy, confidentiality, or other constraints.  Review and approval requirements for releasing USGS data are described in SM 502.8.  For information on data sharing, see https://www.usgs.gov/data-management/publishshare.

G.  Describe (Metadata, Documentation).  Metadata must accompany USGS scientific data to enable reuse and reproducibility of research results.  Standardized metadata are required in order for USGS data to be approved for release as described in SM 502.7.  Documentation, in addition to the metadata, may be required and should provide information about the data in the context of their use in specific systems, applications, and settings and may include ancillary materials such as field notes (see https://www.usgs.gov/data-management/describe-metadatadocumentation).

H.  Manage Quality.  Data management activities (including use of standard methods and best practice techniques) must be done in a consistent, objective, documented, and replicable manner to help ensure that high-quality and verifiable results are achieved (refer to SM 502.2).  Quality assurance checks must be made throughout the science data lifecycle (https://www2.usgs.gov/datamanagement/qaqc.php). 

I.  Backup and Secure.  During all data management processes, backup copies of data must be maintained to allow recovery from loss due to human error, hardware failure, computer viruses, power failure, or natural disaster.  Best practices for backing up and securing data at all stages of the data lifecycle are available (see https://www.usgs.gov/data-management/backup-secure).
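One common way to confirm that a backup copy can actually support recovery is to compare cryptographic checksums of the original file and the copy.  The following is a minimal sketch under that assumption, not a prescribed USGS procedure; the function names and the file names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy source into backup_dir and confirm the copy matches the original."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


# Demonstration with a temporary file standing in for a data file.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "observations.csv"
    original.write_text("site,value\nA,1.0\n")
    copy = backup_and_verify(original, Path(tmp) / "backup")
    verified = sha256_of(original) == sha256_of(copy)
print(verified)
```

A checksum comparison only shows that the copy matched at the time it was made.  Protection against hardware failure or natural disaster additionally depends on where the copies are kept, which this sketch does not address.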

5.  Responsibilities.  Compliance with this policy is incumbent on all USGS employees involved in scientific data lifecycle-related activities listed in section 4 above.  Designated officials have specific roles in establishing and enforcing the policies and other requirements that underpin data management:

A.  Associate Directors and Regional Directors.  Associate Directors (ADs) and Regional Directors (RDs), as members of the USGS Executive Leadership Team (ELT), set policy for how scientific investigations, research, and related activities are carried out, as well as how data and information products are reviewed and approved for release and dissemination.  The ADs and RDs provide oversight and support for the data management activities in their respective mission and regional areas.  They collaborate with each other to address issues or take corrective action with regard to the scientific data-management lifecycle activities.

B.  Office of Science Quality and Integrity, Core Science Systems, and Office of Enterprise Information.  The Office of Science Quality and Integrity (OSQI), Core Science Systems (CSS), and Office of Enterprise Information (OEI) are responsible for jointly developing USGS data management policy and collaborating on the development of related guidance and procedures.  The OSQI coordinates with the ADs, RDs, or the entire ELT as needed to address and resolve issues regarding the execution of this policy and related data-management review and approval processes.  The OSQI also maintains and communicates this and other FSP-related policy documents.  The CSS maintains comprehensive guidance and procedures related to data management (refer to the USGS Data Management website).  The OEI is responsible for the records management program that informs data records management policy. 

C.  Science Center Directors.  Science Center Directors or their designees ensure compliance with data-management and approval requirements for data produced in their Centers or offices and consult with their respective ADs, RDs, managers (program and project), scientists, and others on their staff as needed with regard to carrying out data management activities, including ensuring the development of data management plans for all new research proposals and the updating of these plans.  They also assign or ensure the assigning of data managers to oversee or steward the lifecycle activities of their respective data products.

D.  Approving Officials.  Approving Officials, including Science Center Directors (or their designees) and Bureau Approving Officials in the OSQI, ensure that USGS standards for scientific quality are followed by confirming that appropriate review, approval, and release requirements as described in SM 502.8 are met before they grant Bureau approval of the data products for which they have approval authority.

E.  Records Officer.  The Records Officer in the OEI ensures that Bureau-wide policies, standards, and procedures, including records schedules, are in place to provide guidance on creating accurate and complete records, maintaining these records throughout the science data lifecycle, and transferring permanent scientific records in accordance with applicable USGS and NARA records management requirements.

 

/s/ Jose R. Aragon

Associate Director for Administration

Date: 01/13/2017