An official website of the United States government. Here's how you knowHere's how you know
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
Secure .gov websites use HTTPS
A lock () or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.
Latest Earthquake | Chat Share
Data standards are the guidelines by which data are described and recorded. In order to share, exchange, combine and understand data, we must standardize the format as well as the meaning.
Data standards are produced by a consensus of subject matter experts and are ratified by a standards authority such as the International Organization for Standardization (ISO) and the Federal Geographic Data Committee (FGDC).
Data stewards and data managers can help determine the appropriate data standards to use for a project. Researchers are responsible for implementing the use of data standards in their projects.
Standards make it easier to create, share, and integrate data by ensuring that the data are represented and interpreted correctly. Standards also reduce the time spent cleaning and translating data. Cleansing “dirty data” is a common barrier encountered by scientists, taking 26% of data scientists’ on-the-job time (Anaconda, 2020). For example, when integrating datasets from different sources, each of which used a different format for their date variable (e.g., April 2, 2024, 04-02-24, 04/02/2024), it would be a time consuming task to interpret and convert the dates into a common format before integrating the data.
Hear from a USGS Biologist about how data standards increase efficiency in research.
Dataset-level standards specify the scientific domain, structure, relationships, field labels, and parameter-level standards for the dataset as a whole. A dataset-level standard is normally documented with a data dictionary (link to data dictionary page). See below for examples of formal dataset-level standards:
Biological: Darwin Core (URL: https://dwc.tdwg.org/)
Climate and Forecast: CF-Conventions (URL: https://cfconventions.org/)
Mapping: US Topo Maps (URL: https://doi.org/10.3133/tm11B2)
Parameter-level standards define the format and units for a given parameter or field within a dataset and help users correctly interpret the values. Parameter-level standards should be adopted at the time of data collection, that is when values in a field are created or recorded. If standardization of a parameter in an existing dataset will result in any loss of original detail or information, a best practice is to retain the original parameter and add a separate field for the standardized parameter. The Darwin Core standard, for example, provides verbatim fields for this purpose.
Below are some examples of common earth and biological science data parameters and data standards.
Date / time
Data Standard: Divisions of Geologic Time – Major Chronostratigraphic and Geochronologic Units
Data Standard: IUPAC-IUGS common definition and convention on the use of the year as a derived unit of time (IUPAC Recommendations 2011): http://doi.org/10.1351/PAC-REC-09-01-22
Geographic location descriptors
Data Standard: ISO 6709:2008
Format: ±90.00 and ±180.00 (precision documented by number of decimal places and contingent upon equipment used)
Example: Latitude: 42.3300; Longitude: -98.1449
Geographic names / codes
Data Standard: ANSI INCITS 446-2008
Browse Features in the Data Standard: Geographic Names Information System (GNIS)
Example: Name: Amicalola Falls; ID: 330601
Data Standard: Hydrologic Unit Codes (HUC)
Example:Watershed Name: Upper Kennebec; HUC: 01030001
Data Standard: ISO-3166
Example: Full name: the United States of America; Alpha-3 code: USA; Numeric code: 840
U.S. state and county codes
Data Standard: Federal Information Processing Standards (FIPS)
Example: County Code: 01001; County Name: Autauga; State Code: 01; State Name: Alabama
Vegetation Classification: United States National Vegetation Classification (USNVC)
Biological Taxonomy: Integrated Taxonomic Information System (ITIS)*
* If ITIS does not address your nomenclature requirements, (contact the ITIS team (email@example.com)); and/or reference another appropriate taxonomic authority .
Table Caption: This excerpt from an occurrence record for a U.S. native bee species uses parameter-level standards recommended by the Darwin Core dataset-level standard (the full record can be viewed at https://www.gbif.org/occurrence/1456598984).
Most dataset-level standards (see above) also offer guidance on how to encode data. Data encoding standards define the rules for structuring and organizing data for use in a given context. These standards ensure that when applications read data, the information and context is preserved (OGC, 2020a). Data encoding standards are generally associated with a file format (see File Formats page). Researchers should use universal, open source data encoding standards and long-term access, open file formats whenever possible. Character encoding standards, such as Unicode Transformation Format (UTF-8) ensure that characters in the data are correctly interpreted.
Below are some examples of open data encoding standards used in the earth and biological sciences. Abbreviations and acronyms are defined at the end of this section.
GeoTIFF and Cloud Optimized GeoTIFF: provides the rules for describing geographic image data using the TIFF file format
GeoJSON: defines the rules for describing geographic features using JSON.
NetCDF: supports electronic encoding of geospatial data, specifically digital geospatial information representing space and time-varying phenomena, using the HDF file format.
OGC GeoPackage: defines the rules for what goes into the GeoPackage file format, which is an alternative to Esri’s proprietary yet popular Shapefile* format. *Although proprietary, the technical specifications are open.
NARA RFC 4180: There is no single encoding standard for creating CSV files; however, the Library of Congress uses the NARA RFC 4180 as the encoding format specification to define the structure of CSV files.
OGC Web Map Service: allows users to remotely access georeferenced map images via HTTPS requests.
OGC Web Coverage Service: allows users to access online geospatial data in multiple raster-based data formats (e.g. GeoTiffs, .img, ENVI (.hdr) file types).
GML Web Feature Service: allows users to access online geospatial data at the feature level using formats such as Shapefile, GML, etc.
Abbreviations and Acronyms
Parameter-level and dataset-level data standards should be documented in the accompanying data dictionary and metadata record. For example, if following the Content Standard for Digital Geospatial Metadata (CSDGM), the dataset-level data standards can be documented in the Entity and Attribute Overview Description section of the metadata and the parameter-level data standards can be documented in the Entity and Attribute Detailed Description for each attribute described in the metadata record.
For more on metadata and data dictionaries, see Metadata Creation and Data Dictionaries.
In any given Federal agency, there can be multiple groups of experts that produce standards and authorities that approve them, often based on the science topic. There is no single group that establishes or recommends standards for the USGS.
International Organization for Standardization (ISO)
USGS National Geospatial Program Standards and Specifications
Federal Geographic Data Committee (FGDC) National Data Standards Publications
FGDC Standards Working Group
Open Geospatial Consortium
Community Guidelines / Norms
Standards often evolve out of communities of practice coming together and agreeing on common practices. Here is an example of an evolving best practice that hasn’t been formally ratified by a standards authority. Does your USGS project use community guidelines that you’d like us to link here? Contact GS_Data_Management@usgs.gov.
Interpreting and reporting 40Ar/39Ar geochronologic data
The USGS Survey Manual Chapter 502.2 - Fundamental Science Practices: Planning and Conducting Data Collection and Research addresses data and metadata standards:
"The data collected and the techniques used by USGS scientists should conform to or reference national and international standards and protocols if they exist and when they are relevant and appropriate. For datasets of a given type, and if national or international metadata standards exist, the data are indexed with metadata that facilitate access and integration."
USGS Survey Manual Chapter SM 502.6 - Fundamental Science Practices: Scientific Data Management Foundation specifies that a data management plan will include standards and intended actions as appropriate to the project for acquiring, processing, analyzing, preserving, publishing/sharing, describing, and managing the quality of, backing up, and securing the data holdings.
Anaconda, 2020, State of Data Science: Moving from Hype Toward Maturity, Anaconda, accessed March 12, 2021, at https://www.anaconda.com/state-of-data-science-2020?utm_medium=press&utm_source=anaconda&utm_campaign=sods-2020&utm_content=report.
Internet Engineering Task Force, 2021, Geographic JSON (geojson), accessed April 5, 2021 at https://datatracker.ietf.org/wg/geojson/about/.
OGC, 2020a, Data Encoding Standards, accessed September 8, 2020, at http://opengeospatial.github.io/e-learning/data-encoding-standards/basic-index.html.
OGC, 2020b, GeoTIFF Standard, accessed September 8, 2020, at https://www.ogc.org/standards/geotiff.
RFC 4180, accessed September 8, 2020, a https://tools.ietf.org/html/rfc4180.
Web Coverage Service, accessed October 5, 2020, at http://learningzone.rspsoc.org.uk/index.php/Learning-Materials/Introduction-to-OGC-Standards/3.2-Web-Coverage-Services-WCS.
Page last updated 5/3/21.