Data Management

Data Release

Publication of scientific data as stand-alone products or in conjunction with the scholarly articles they support is integral to the open data movement. The USGS has developed a path for formally releasing or publishing USGS scientific data called a "data release."

Elements of a Data Release

The USGS requires seven elements to release data: (1) a data management plan, (2) data, (3) FGDC-compliant metadata, (4) a DOI, (5) data and metadata review, (6) an acceptable data repository, and (7) metadata available through the USGS Science Data Catalog.

Checklists for Data Release Reviews

Checklists are available to help data authors, reviewers, and USGS Center Directors as they work through the review process.

Open Data Overview 

Requirements referenced in the USGS Public Access Plan:

  • The Office of Science and Technology Policy (OSTP) February 22, 2013, Memorandum entitled "Increasing Access to the Results of Federally Funded Scientific Research" requires public access to digital datasets resulting from federally funded research, including datasets used to support scholarly publications.
  • The Office of Management and Budget (OMB) May 9, 2013, Memorandum M-13-13 entitled "Open Data Policy-Managing Information as an Asset" requires agencies to support downstream dissemination activities for all new information created and collected (e.g., by using machine-readable and open formats, data standards, and common core and extensible metadata).

 

What the U.S. Geological Survey Manual Requires: 

  • SM 502.6 - Fundamental Science Practices: Scientific Data Management
  • SM 502.7 - Fundamental Science Practices: Metadata for USGS Scientific Information Products including Data
  • SM 502.8 - Fundamental Science Practices: Review and Approval of Scientific Data for Release
  • SM 502.9 - Fundamental Science Practices: Preservation Requirements for Digital Scientific Data

 

USGS Data Release Resources 

 

Elements of a Data Release 

Diagram of the elements of a USGS data release: data, metadata, digital object identifier, IPDS, USGS dataset repository, SDC

What constitutes a release of USGS scientific data within USGS?

1. Data Management Plan (DMP)

2. Scientific Data

3. FGDC-Compliant Metadata

4. USGS Digital Object Identifier (DOI)

5. Reviews of Data and Metadata

6. An Acceptable Data Repository

7. Metadata Available in the USGS Science Data Catalog

 

  1. Data Management Plan (DMP) 
     


    For every project, the USGS requires a data management plan. This plan should be written prior to beginning project work, and updated throughout the project. A data management plan focuses on how the data will be handled throughout the project. For example, how will the data be obtained or collected? What is the schedule and budget for data collection? How will the data be quality checked? How will the data be stored, accessed, and protected? A good data management plan provides a strategy for how you will answer all of these questions. Learn more about DMPs at Plan > Data Management Plans.

     
     
     
     
     
     

  2. Scientific Data 

     

    Screencapture of a comma-separated values (CSV) formatted file.

    To support longevity, make sure your data are in an open format (CSV, ASCII, GIF, NetCDF, GeoTIFF, etc.). The data can be released on their own or alongside the scholarly publication they support. Learn more about file format options at Acquire > Data & File Formats.
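As a minimal sketch of what writing to an open format can look like, the example below serializes a small table to CSV using only the Python standard library. The field names and observation values are hypothetical, invented for illustration, and are not drawn from any USGS dataset.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical observations, invented for illustration only.
observations = [
    {"site_id": "A01", "date": "2020-05-01", "water_temp_c": 12.4},
    {"site_id": "A02", "date": "2020-05-01", "water_temp_c": 11.9},
]

csv_text = rows_to_csv(observations, ["site_id", "date", "water_temp_c"])
print(csv_text)
```

Plain-text formats such as CSV can be opened without proprietary software, which is one reason they tend to remain readable over time.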
     
     
     
     
     
     
     
      

  3. FGDC-Compliant Metadata 
       
    Screenshot of the Metadata Wizard 2.0

    The MetadataWizard is a useful tool designed to facilitate FGDC metadata creation for spatial and non-spatial data sets. Learn more about the MetadataWizard.

    Metadata describe a dataset so that it can be understood, reused, and integrated with other datasets. A metadata record includes information such as where the data were collected, who is responsible for the dataset, why the dataset was created, and how the data are organized. Learn more about metadata creation at Describe > Metadata.
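To make the "who, why, and what" of a metadata record more concrete, here is a rough sketch that builds a small XML fragment with Python's standard library. The element names (idinfo, citation, citeinfo, origin, pubdate, title, descript, abstract, purpose) are assumed to follow the short tag names of the FGDC CSDGM standard; the helper function and all values are hypothetical. The fragment is far from a complete record and should not be expected to pass FGDC validation. Tools such as the MetadataWizard are the intended route for producing compliant metadata.

```python
import xml.etree.ElementTree as ET


def build_metadata_stub(title, originator, pubdate, abstract, purpose):
    """Build a partial, illustrative metadata fragment (not a complete record)."""
    metadata = ET.Element("metadata")
    idinfo = ET.SubElement(metadata, "idinfo")

    # Citation: who is responsible for the dataset and what it is called.
    citation = ET.SubElement(idinfo, "citation")
    citeinfo = ET.SubElement(citation, "citeinfo")
    ET.SubElement(citeinfo, "origin").text = originator
    ET.SubElement(citeinfo, "pubdate").text = pubdate
    ET.SubElement(citeinfo, "title").text = title

    # Description: what the dataset contains and why it was created.
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    ET.SubElement(descript, "purpose").text = purpose
    return metadata


# Hypothetical values, invented for illustration only.
record = build_metadata_stub(
    title="Example stream temperature observations",
    originator="Example Author",
    pubdate="2020",
    abstract="Daily water temperature readings at two example sites.",
    purpose="Illustrates the kind of information a metadata record holds.",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```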

    Once you have created metadata, it needs to:

  4. USGS Digital Object Identifier (DOI) 
         
    Screencapture of the USGS Digital Object Identifier (DOI) Creation Tool.

    The USGS Digital Object Identifier (DOI) Creation Tool creates and manages USGS DOIs.

    Persistent identifiers are globally unique numeric and/or character strings that enable a user to access a digital resource via a permanent, long-term link. While there are several standard persistent identifier systems, the USGS uses Digital Object Identifiers (DOIs) for its information products. All data released by the USGS must have DOIs. DOIs are especially useful for citing your data: as with a publication, your data can be cited and you can receive credit for them. Learn more about DOIs at Publish/Share > Digital Object Identifiers.

    Once you have created a digital object identifier, it needs to:

    • appear in your metadata record
    • be included in your publication as a data citation, if it applies
    • be managed in the USGS DOI Creation Tool, if the data location changes
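The first requirement in the list above, that the DOI appear in the metadata record, can be spot-checked with a short script. The sketch below is an informal aid and not part of any USGS tool. The regular expression is a common approximation of DOI syntax rather than a full specification, and the DOI and metadata text are hypothetical placeholders.

```python
import re

# Approximate DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is a heuristic, not a complete definition of valid DOIs.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def find_dois(text):
    """Return the distinct DOI-like strings found in text, sorted."""
    return sorted(set(DOI_PATTERN.findall(text)))


def doi_in_metadata(doi, metadata_text):
    """Check case-insensitively whether doi appears in metadata_text."""
    found = {candidate.lower() for candidate in find_dois(metadata_text)}
    return doi.lower() in found


# Hypothetical metadata snippet with a placeholder DOI.
metadata_text = "<onlink>https://doi.org/10.1234/EXAMPLE123</onlink>"

print(find_dois(metadata_text))
print(doi_in_metadata("10.1234/example123", metadata_text))
```

The comparison ignores case because DOI names are treated as case-insensitive. In running prose, a DOI followed directly by a period or comma would be captured together with that punctuation, so the pattern would need tightening before being used on free text.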
       
       
       
       
       
       
        
       
  5. Reviews of Data and Metadata 
          

    Any data released by the USGS, whether provided to support a scientific publication or for use by the public or by cooperators, must first be reviewed and approved. Review is necessary to ensure that the data are well documented and are complete, consistent, accurate, and precise as needed to achieve the goals for which they were created.

    Both metadata and data must be reviewed. The reviews may be carried out by one or more people, but reviewers need to examine both the data and the metadata in order to understand the data and to ensure that the metadata accurately describe them. Metadata, data review documents, and reconciliations are maintained in an internal USGS system.

    Screenshot of Data Release Checklist for Center Directors

    Data Release Checklist for Center Directors

      
    Data Release Review Checklists 

    A number of checklists are available to help data authors, reviewers, and USGS Center Directors as they work through the review process.

    Checklists for data authors:

    Checklists for reviewers of data and metadata:

    Checklist for USGS Center Directors approving a data release:

    Here are some other useful resources:

  6. An Acceptable Data Repository 
        
    A ScienceBase item.

    ScienceBase is a certified USGS Trusted Digital Repository, making it a common location to store and maintain USGS data (Public domain).

    Data funded by the USGS must be released on a government server. This can take the form of a Science Center website, an approved data application, or a repository. Regardless of its form, the release point should represent the components of a USGS "Trusted Digital Repository." Not all USGS data releases will look the same.

    USGS ScienceBase offers one possible way to store and maintain your data, and offers assistance in data release.

    See USGS ScienceBase Data Release FAQs.

    Learn more about data repositories at Preserve > Repositories.
     
     
     

  7. Metadata Available in the USGS Science Data Catalog 
         
    Screenshot of the SDC homepage

    The USGS Science Data Catalog provides a complete list of official USGS data products (Public domain).

    Your released data must be shared with the public and research communities through the USGS Science Data Catalog. This metadata catalog provides seamless access to USGS research and monitoring data from across the nation. Users can search, browse, or use a map-based interface to discover data. For data providers, the USGS Science Data Catalog:

    • meets White House Open Data reporting requirements for USGS
    • provides a Search and Discovery Tool that allows for metadata retrieval, visualization, download, and linking back to original data providers
    • offers a single source for USGS to serve its metadata to data.doi.gov, Data.gov, and OMB
    • helps ensure that USGS metadata meet requirements

    Learn more about the Science Data Catalog at Publish/Share > Data Catalogs and Portals.

 
Examples of Data Releases Across USGS 

 

Fitting it all together: 

Figuring out a workflow for data release in USGS can be interesting. Here is an example workflow for authors using ScienceBase to release their data: