USGS - science for a changing world

USGS Data Management

[Figure: USGS Data Lifecycle Diagram, showing Plan, Acquire, Preserve, Publish/Share, Describe (Metadata and Documentation), Manage Quality, and Backup and Secure]

Data Dictionaries and Thesauri

Data dictionaries define and describe the elements of a data resource, and thesauri supply preferred terms that make your data easier to discover.

Key Points

  • A data dictionary is a repository of information that defines and describes a data resource.
  • A thesaurus is a structured list of preferred terms around a subject.
  • Use widely known keywords and tags to make your data more searchable and discoverable.
  • Find preferred terms and keywords with a thesaurus (e.g., USGS Biocomplexity Thesaurus).

Definition: Data Dictionary

According to the DOI Data Management Guide (2008), a data dictionary is a repository of metadata that defines and describes a data resource.

Definition: Thesaurus

A thesaurus is a structured list of preferred terms, or subjects, that indicates the relationships between those terms. A preferred term is the focal point where all information about a concept is collected. One preferred term can be broader than, narrower than, or otherwise related to another.

A thesaurus also lists non-preferred terms, which indexers and searchers should not use. A good thesaurus makes clear what a term is meant to cover by providing preferred terms, their relationships with other preferred terms, and non-preferred terms.
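
The structure described above can be sketched in a few lines of Python. This is only an illustration, not how the USGS Biocomplexity Thesaurus is implemented: the class names, method names, and sample terms ("rainbow trout", "fishes", "steelhead") are all hypothetical and chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PreferredTerm:
    """A preferred term and its relationships to other preferred terms."""
    label: str
    broader: set[str] = field(default_factory=set)
    narrower: set[str] = field(default_factory=set)
    related: set[str] = field(default_factory=set)


class Thesaurus:
    """Hypothetical in-memory thesaurus, keyed by lower-cased term."""

    def __init__(self) -> None:
        self.terms: dict[str, PreferredTerm] = {}
        # Maps each non-preferred term to the preferred term to use instead.
        self.use_instead: dict[str, str] = {}

    def add_term(self, label: str) -> PreferredTerm:
        """Return the preferred term for `label`, creating it if needed."""
        return self.terms.setdefault(label.lower(), PreferredTerm(label))

    def add_broader(self, narrow: str, broad: str) -> None:
        """Record that `broad` is broader than `narrow`, and the inverse."""
        self.add_term(narrow).broader.add(broad)
        self.add_term(broad).narrower.add(narrow)

    def add_non_preferred(self, variant: str, preferred: str) -> None:
        """Register `variant` as a term that should redirect to `preferred`."""
        self.add_term(preferred)
        self.use_instead[variant.lower()] = preferred

    def resolve(self, query: str) -> PreferredTerm | None:
        """Look up a query, following a non-preferred term to its preferred one."""
        key = query.lower()
        key = self.use_instead.get(key, key).lower()
        return self.terms.get(key)


# Sample terms are invented for illustration only.
thesaurus = Thesaurus()
thesaurus.add_broader("rainbow trout", "fishes")
thesaurus.add_non_preferred("steelhead", "rainbow trout")

match = thesaurus.resolve("Steelhead")
print(match.label, sorted(match.broader))
```

A searcher who types the non-preferred term is redirected to the preferred term, and from there can move to broader or narrower terms.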

Data Dictionary Definitions

Example:

Entity: Fish Measurements - A sample of the physical measurements of rainbow trout in Lake Superior, MN, collected on 07/14/2010.

Attributes:

  • Attribute Type: fish_totl
  • Attribute Definition: Fish Total Length - Measured total length (cm) of the fish from the mouth to the tip of the tail fin, with the mouth shut and the fin pinched closed.
  • Domain Range: 0.00-300.00; -999 = NA
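
One way to put an entry like this to work is to store it in machine-readable form and check incoming values against it. The sketch below is an assumption about how that might look, not a USGS tool: the `Attribute` class, its field names, and the `check_value` helper are hypothetical, while the attribute name, unit, range, and missing-value code come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One data dictionary entry (field names are hypothetical)."""
    name: str
    definition: str
    unit: str
    minimum: float
    maximum: float
    missing_code: float  # sentinel value that stands for NA


# Values taken from the fish_totl example in the text.
FISH_TOTL = Attribute(
    name="fish_totl",
    definition="Fish Total Length - measured total length of the fish.",
    unit="cm",
    minimum=0.0,
    maximum=300.0,
    missing_code=-999,
)


def check_value(attr: Attribute, value: float) -> str:
    """Classify a value as 'missing', 'valid', or 'out_of_range'."""
    # Test the sentinel first: -999 lies outside the domain range,
    # so it would otherwise be reported as an out-of-range measurement.
    if value == attr.missing_code:
        return "missing"
    if attr.minimum <= value <= attr.maximum:
        return "valid"
    return "out_of_range"


# Made-up readings for illustration.
readings = [42.5, -999, 512.0]
print([check_value(FISH_TOTL, v) for v in readings])
# prints ['valid', 'missing', 'out_of_range']
```

Keeping the missing-value code in the dictionary entry, rather than hard-coding it in analysis scripts, lets every consumer of the data treat -999 as NA in the same way.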

Entity definitions:

  • Define a person, place, or thing about which data can be stored
  • Must be clearly understood before attributes can be named or defined

Data element (attribute) definitions:

  • Describe the inherent nature of the data
  • Do NOT describe the entity that the attribute contains information about
  • Do NOT describe the uses of the data (where, when, how, or by whom)
  • Do NOT list the codes and the values the codes represent

Tools

  • USGS Biocomplexity Thesaurus Project
    Description: The Biocomplexity Thesaurus Project is a thesaurus and dictionary database of terms and concepts in nearly every scientific field. The Biocomplexity Thesaurus serves as a controlled vocabulary for facilitating improved access and retrieval of data and information. Users can query the thesaurus for matching and related terms, both specific and broad.
    URL: http://www.usgs.gov/core_science_systems/csas/biocomplexity_thesaurus/index.html

URL: http://www.usgs.gov/datamanagement/describe/dictionaries.php
Page Last Modified: Monday, November 24, 2014