USGS - science for a changing world

USGS Data Management

[Figure: USGS Data Lifecycle Diagram, with elements Plan, Acquire, Preserve, Publish/Share, Describe (Metadata and Documentation), Manage Quality, and Backup and Secure]

Data Standards

Standards make it easier to create, share, and integrate data by ensuring a clear understanding of how the data are represented and that the data you receive are in the form you expect.

What are Data Standards?

Key Points

  • Standards are rules establishing how data are described and recorded in a consistent format.
  • Using standards makes data usable beyond the project or person that created them.
  • Standards are useful for integrating data from multiple sources. If the sources agree on a standard at the outset, less time is spent reconciling differences.
  • When collecting new data, try to find data standards for the type of data you are collecting.
  • Sources of existing data standards include the Federal Geographic Data Committee (FGDC), the USGS Core Science Analytics and Synthesis (CSAS) Program, and the National Geospatial Program.

Data standards are the rules by which data are described and recorded. In order to share, exchange, and understand data, we must standardize the format as well as the meaning.

Why do we need Data Standards?

Standards make things easier to use. For example, suppose you need a AAA battery for your flashlight. Because all AAA batteries are produced to a standard, they are all the same size, so any brand of AAA battery will work in your flashlight.

The Bureau of Land Management notes that "Standards provide data integrity, accuracy and consistency, clarify ambiguous meanings, minimize redundant data, and document business rules." Using data standards allows the agency to move from "project-based" data files to "enterprise" data files, and vice versa. In other words, the data become usable beyond the project or person that created them, because you know what format the data will be in and what they represent.

If different groups use different data standards, combining data from multiple sources is difficult, if not impossible. To return to the flashlight: if there were no standard for AAA batteries, you couldn't use just any AAA battery. You would have to find one made specifically for your make and model of flashlight, and you would need to keep many sets of batteries in your house, one for each device, instead of one set that works in every case.

Data Standard Example:

Name: Latitude
Code Name: cor_lat_meas
Format: CHAR(16)
Definition: Coordinate Latitude is the angle between the plane of the reference ellipsoid's equator and a normal to the ellipsoid surface. It is formatted by direction, degrees, minutes, decimal seconds (60 24 32.56 N). This item is analogous to the 'Y' value of a rectangular coordinate system.
FGDC Alias: Y Coordinate
FGDC Definition: This is the Y Coordinate value or northing for a coordinate set.
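
A standard like this one can also be checked mechanically. The following is a minimal sketch in Python, assuming the value layout matches the example value (60 24 32.56 N): degrees, minutes, decimal seconds, and a hemisphere letter, separated by single spaces. The function name parse_latitude, the regular expression, and the range checks are illustrative assumptions and are not part of the standard itself.

```python
import re

# Assumed layout, taken from the example value "60 24 32.56 N":
# degrees, minutes, decimal seconds, hemisphere letter.
LAT_PATTERN = re.compile(r"^(\d{1,2}) (\d{1,2}) (\d{1,2}(?:\.\d+)?) ([NS])$")
MAX_LENGTH = 16  # from the CHAR(16) format in the example


def parse_latitude(value: str) -> float:
    """Validate a latitude string and return it as signed decimal degrees."""
    text = value.strip()
    if len(text) > MAX_LENGTH:
        raise ValueError(f"latitude exceeds {MAX_LENGTH} characters: {value!r}")
    match = LAT_PATTERN.match(text)
    if match is None:
        raise ValueError(f"latitude does not match the expected layout: {value!r}")

    degrees = int(match.group(1))
    minutes = int(match.group(2))
    seconds = float(match.group(3))
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes or seconds out of range: {value!r}")

    decimal = degrees + minutes / 60 + seconds / 3600
    if decimal > 90:
        raise ValueError(f"latitude is beyond 90 degrees: {value!r}")

    # Southern latitudes are negative in decimal degrees.
    return -decimal if match.group(4) == "S" else decimal
```

Calling parse_latitude("60 24 32.56 N") returns a value of roughly 60.409 decimal degrees, while a value that does not follow the layout, such as "60.41", raises ValueError instead of entering the dataset unnoticed.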

Date formatting is another example. The same date can be written as:

  • April 2, 1974
  • 04-02-74
  • 04/02/1974
  • 4/2/74
  • 19740402
  • 04021974 - is this April 2 or February 4?
  • 2 April 1974

If you were integrating datasets from different sources, each of which used a different format for its date variable, you would first have to convert the dates into a common format. If everyone agreed on a date standard at the outset, this extra step would not be needed.
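
The conversion step described above can be sketched in Python. This example assumes ISO 8601 (YYYY-MM-DD) as the common target format, which is a choice made for illustration and is not named in the text above. The source labels and the SOURCE_FORMATS mapping are hypothetical; the point is that each source must declare its format, because a string such as 04021974 cannot be interpreted reliably by guessing.

```python
from datetime import datetime

# Hypothetical mapping: each data source declares the date format it uses.
SOURCE_FORMATS = {
    "source_a": "%B %d, %Y",  # April 2, 1974
    "source_b": "%m/%d/%Y",   # 04/02/1974
    "source_c": "%Y%m%d",     # 19740402
    "source_d": "%d %B %Y",   # 2 April 1974
    "source_e": "%m%d%Y",     # 04021974, declared as month-day-year
}


def to_iso_date(value: str, source: str) -> str:
    """Convert a date string from a known source to ISO 8601 (YYYY-MM-DD)."""
    fmt = SOURCE_FORMATS[source]
    # strptime raises ValueError if the value does not match the declared format.
    return datetime.strptime(value.strip(), fmt).date().isoformat()
```

With this mapping, to_iso_date("April 2, 1974", "source_a") and to_iso_date("19740402", "source_c") produce the same string, so the datasets can be joined on the date field. If source_e had instead declared day-month-year ("%d%m%Y"), the same input 04021974 would be read as February 4, which is why the format has to be agreed on rather than inferred.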

A structured data element name gives us:

  • An informative name
  • A description and definition
  • The ability to assign unique, consistent names
  • The ability to identify the natural relationships of data
  • The ability to identify all of the uses of a data element
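
As a rough illustration of these properties, a structured data element can be represented as a small record, and a collection of such records can enforce unique names. The sketch below is written in Python and reuses the Latitude example from earlier on this page; the class DataElement, its field names, and the function build_dictionary are hypothetical and do not correspond to any particular agency system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One structured data element (hypothetical field names)."""
    name: str         # informative name
    code_name: str    # unique, consistent identifier
    data_format: str  # storage format
    definition: str   # description and definition


def build_dictionary(elements):
    """Index elements by code name, rejecting duplicate code names."""
    dictionary = {}
    for element in elements:
        if element.code_name in dictionary:
            raise ValueError(f"duplicate code name: {element.code_name}")
        dictionary[element.code_name] = element
    return dictionary


latitude = DataElement(
    name="Latitude",
    code_name="cor_lat_meas",
    data_format="CHAR(16)",
    definition="Coordinate latitude, e.g. 60 24 32.56 N.",
)
dictionary = build_dictionary([latitude])
```

Looking up dictionary["cor_lat_meas"] returns the Latitude element with its format and definition, and attempting to register a second element under the same code name raises ValueError.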

Where can I find Data Standards? Can I develop my own Data Standard?

When collecting new data, existing Data Standards should always be used where applicable. The FGDC develops geospatial Data Standards only when no equivalent voluntary consensus standards exist, in accordance with OMB Circular A-119. Some sources for Data Standards are:

Data Administrators / Data Stewards

These individuals design the data, control how information is represented, and ensure that the data can be used for all business needs. See Plan > Data Stewardship for more information.

Data Dictionaries

Data Dictionaries contain structured data names for people to use. See Describe > Data Dictionaries for more information.

What the U.S. Geological Survey Manual Says:

The USGS Manual Chapter 502.2 - Fundamental Science Practices: Planning and Conducting Data Collection and Research addresses data and metadata standards:

"The data collected and the techniques used by USGS scientists should conform to or reference national and international standards and protocols if they exist and when they are relevant and appropriate. For datasets of a given type, and if national or international metadata standards exist, the data are indexed with metadata that facilitate access and integration."

References

  • Chatfield, T., Selbach, R. February, 2011. Data Management for Data Stewards. Data Management Training Workshop. Bureau of Land Management (BLM).

URL: http://origin-www.usgs.gov/datamanagement/plan/datastandards.php
Page Last Modified: Tuesday, April 08, 2014