Updated January 2022
Introduction
These standards are intended to assist in selecting, specifying, building, operating, or enhancing trusted repositories for USGS digital scientific assets. This document includes a table for use by Bureau scientists and management (in collaboration with information technology (IT) staff) in the technical evaluation of systems for preserving these digital assets. The table establishes the minimum USGS standards for a trusted digital repository (refer to Level Three in the table below). The standards in the table are based on material from the Library of Congress-sponsored National Digital Stewardship Alliance (National Digital Stewardship Alliance, 2013). These standards do not cover physical data or address topics such as preservation policies, funding, or organizational competency and longevity, which are critical for data preservation but beyond the scope of this document.
For purposes of this document, important definitions related to preservation of USGS digital assets are as follows:
- Long-term: A period of time long enough for there to be concern about the loss of integrity of digital information held in a repository, including deterioration of storage media, changing technologies, support for old and new media and data formats, and a changing user community. This period extends into the indefinite future.
- Sustainable format: A format that preserves the ability to access an electronic record throughout its lifecycle, regardless of the technology used to create it. A sustainable format increases the likelihood that a record will remain accessible in the future.
- Checksums: A checksum is a short mathematical digest of a file, which changes if any bit in the file changes. Checksums are used to detect unexpected changes in file content. Federal agencies, including the USGS, should use the following National Institute of Standards and Technology (NIST)-approved checksums for new systems: SHA-224, SHA-256, SHA-384, and SHA-512. MD5 and SHA-1 checksums are widely used but are not approved for new systems. For more information on checksums, refer to http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html.
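As an illustration of the checksum definition above, the following is a minimal sketch of computing a file digest with Python's standard `hashlib` module. The function name `file_checksum`, the `APPROVED_ALGORITHMS` set, and the chunk size are assumptions made for this example and are not part of the USGS standard; the sketch simply restricts callers to the four algorithms listed above.

```python
import hashlib
from pathlib import Path

# Algorithms listed in this document as approved for new systems.
# The set name and the lowercase spellings (hashlib's names) are illustrative.
APPROVED_ALGORITHMS = {"sha224", "sha256", "sha384", "sha512"}


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hexadecimal digest of the file at `path`.

    Hypothetical helper: reads the file in chunks so that large
    data files do not need to fit in memory.
    """
    if algorithm not in APPROVED_ALGORITHMS:
        raise ValueError(f"{algorithm} is not in the approved list for new systems")
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A repository could record the returned value when a file is ingested and recompute it later; a mismatch between the stored and recomputed values indicates that the file content has changed.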
Elements to Consider for Digital Asset Preservation
When considering how to preserve digital assets, you should address the following technical elements, for which standards are provided in the table below:
- Storage and Geographic Location – Storage systems, locations, and data duplication to prevent loss.
- Data Integrity – Procedures to prevent, detect, and recover from unexpected or deliberate changes.
- Information Security – Procedures to prevent human-caused corruption, deletion, and unauthorized access.
- Metadata – Documentation to enable contextual understanding and long-term usability.
- File Formats – File types, structures, and naming conventions to aid long-term preservation and reuse.
- Physical Media – Basic recommendations to reduce obsolescence risks that can threaten the readability of physical media.
Levels of Digital Asset Preservation
There are four levels of digital asset preservation:
- Level One: Level One comprises the minimum criteria and activities needed to maintain digital assets through the life of a research project.
- Level Two: To continue improving upon repository functionality, implement Level Two elements after all Level One elements are in place.
- Level Three: Implement Level Three elements after all Level Two elements are in place. These are the minimum USGS trusted digital repository criteria for all long-term preservation records.
- Level Four: Level Four is the optimum level for which USGS should strive.
The Levels of Digital Preservation table below is based on a left-to-right progression. For each element, the columns describe four levels of increasing assurance for digital assets to be preserved. Additional guidelines are as follows:
- Each level adds requirements to the previous levels.
- To enhance an existing digital data repository, upgrade all elements to the same level.
- To achieve designation as a trusted digital repository, the repository must meet at least Level Three.
- For highest assurance of data preservation, specify all elements at Level Four.
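The guidelines above can be sketched as a simple evaluation routine. This is an illustrative reading only: it assumes that a repository's overall level is the lowest level achieved across the six elements, and that a value of 0 means Level One has not yet been met. The names `ELEMENTS`, `overall_level`, and `is_trusted` are hypothetical and do not come from the USGS standard.

```python
# The six technical elements named in this document.
ELEMENTS = (
    "Storage and Geographic Location",
    "Data Integrity",
    "Information Security",
    "Metadata",
    "File Formats",
    "Physical Media",
)

# Minimum level required for designation as a trusted digital repository.
TRUSTED_MINIMUM_LEVEL = 3


def overall_level(assessed):
    """Return the lowest level achieved across all elements.

    `assessed` maps each element name to an integer from 0 to 4.
    Assumption: 0 stands for "Level One not yet met".
    """
    missing = [name for name in ELEMENTS if name not in assessed]
    if missing:
        raise ValueError(f"elements not assessed: {missing}")
    for name in ELEMENTS:
        if assessed[name] not in range(0, 5):
            raise ValueError(f"level for {name} must be between 0 and 4")
    # Assumption: the weakest element determines the repository's level.
    return min(assessed[name] for name in ELEMENTS)


def is_trusted(assessed):
    """True if every element meets at least Level Three."""
    return overall_level(assessed) >= TRUSTED_MINIMUM_LEVEL
```

Under these assumptions, a repository with five elements at Level Three and Metadata at Level Two evaluates to Level Two overall, which shows why upgrading all elements to the same level matters.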
PDF Version. Note: Use this version when printing a hard copy of the tabulated information.
Levels of Digital Preservation

Element | Level One | Level Two | Level Three | Level Four |
---|---|---|---|---|
Storage and Geographic Location | | | | |
Data Integrity | | | | |
Information Security | | | | |
Metadata | | | | |
File Formats | | | | |
Physical Media | | | | |

Derived from Library of Congress, National Digital Stewardship Alliance, NDSA Levels of Digital Preservation: Version 1, February 2013.
Roles and Responsibilities
A repository manager or project chief ensures that all the table elements are addressed, although others, such as data managers or IT specialists, may be responsible for implementation and operation activities.
Scientists and research staff will use the table criteria to recommend the suitability of a potential repository for preserving digital assets.
Management officials will use the table criteria for reviewing and approving the selection of trusted digital repositories.
In consultation with USGS scientists and managers, IT staff will use the table criteria for building, enhancing, or operating trusted digital repositories.
Additional Information
Additional information on preservation of USGS digital assets can be found here.