Data Management

Metadata

Metadata describe information about a dataset, including who, what, where, when, why, and how, so that it can be understood, re-used, and integrated with other datasets. Metadata records follow a standard format to enable interoperability. 

Why Do We Need Metadata?

Metadata are crucial for any use or reuse of data; no one can responsibly re-use or interpret data without metadata that explains how the dataset was created, why, where it is geographically located, and details about the structure of the data.

Uses for Metadata

Metadata are used for enabling data discovery, understanding data, analysis and synthesis, maintaining longevity of a dataset, tracking the progress of a research project, and demonstrating the return on investment for research at an institution.

Getting Started with Metadata Creation 

Gather content for the metadata record

  • Understand what goes into a metadata record (e.g. title, abstract, methods, keywords, etc.).

  • Use the Metadata Questionnaire [PDF] or Metadata in Plain Language to gather content for building a metadata record, or use a metadata creation tool, which will ask you the same questions about your data.

 

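Federal agencies are mandated by Executive Order 12906 to use metadata standards endorsed by the Federal Geographic Data Committee (FGDC): the FGDC Content Standard for Digital Geospatial Metadata (FGDC-CSDGM) and the International Organization for Standardization (ISO) suite of standards.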

Both FGDC-CSDGM and ISO require metadata to be formatted in Extensible Markup Language (XML), although a stylesheet can be applied to the XML to make it easier to read. Learn more about XML under Advanced Users.
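
To make the XML requirement more concrete, the following is a minimal sketch of a metadata fragment built with Python's standard xml.etree.ElementTree module. The element names (idinfo, citation, citeinfo, origin, pubdate, title, descript, abstract, purpose) are CSDGM short names; the function name build_minimal_record and all values are hypothetical, and the result is only a small fragment, not a complete or valid FGDC-CSDGM record.

```python
import xml.etree.ElementTree as ET


def build_minimal_record(title, originator, pubdate, abstract, purpose):
    """Return a small CSDGM-style XML fragment as a string (not a complete record)."""
    metadata = ET.Element("metadata")
    idinfo = ET.SubElement(metadata, "idinfo")

    # Citation information: who produced the data, when, and under what title.
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = originator
    ET.SubElement(citeinfo, "pubdate").text = pubdate
    ET.SubElement(citeinfo, "title").text = title

    # Description: what the data are and why they were collected.
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    ET.SubElement(descript, "purpose").text = purpose

    return ET.tostring(metadata, encoding="unicode")


# Hypothetical values, for illustration only.
record_xml = build_minimal_record(
    title="Example dataset title",
    originator="Example Originator",
    pubdate="2020",
    abstract="What the data are and how they were collected.",
    purpose="Why the data were collected.",
)
print(record_xml)
```

In practice, the metadata creation tools described on this page produce this XML for you; the sketch only shows what the underlying structure looks like.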

 

FGDC-CSDGM Standard

Examples of metadata records in FGDC-CSDGM are available for different types of information products. View each metadata record in its native XML or with a stylesheet applied for easier reading.

 

ISO Standard

An example of a metadata record in ISO 19115-2. Please note that it may contain only certain sections of the ISO standard.

  • USGS Barnegat Bay hydrodynamic model for March-September 2012 [XML][Stylesheet]

For more information about metadata as it pertains to the USGS data release process, visit Metadata for Scientific Data FAQs.

 

Screenshot of the USGS Online Metadata Editor available at https://www1.usgs.gov/csas/ome

Tools for Creating Metadata Records 

The following free tools create or edit FGDC CSDGM metadata in XML. For a wider selection of tools see the FGDC Metadata Tools. For a list of tools for the ISO metadata standard, refer to the FGDC ISO Metadata Editor Review.

The MetadataWizard is a useful tool designed to facilitate FGDC metadata creation for spatial and non-spatial data sets.

  • USGS Online Metadata Editor (OME) - An online form for USGS staff to create FGDC-CSDGM metadata by answering simple questions about their data. Suitable for both biological and non-biological datasets. Log in to start new records or to upload and edit existing ones. Records, whether complete or in progress, can be saved for later or downloaded directly to your computer.

     

  • USGS Metadata Wizard - A Python toolbox in Esri ArcGIS Desktop for creating FGDC-CSDGM metadata for geospatial datasets. The tool ingests geospatial files and, through a semi-automated workflow, creates and updates metadata records in Esri’s 10.x software. Best for geospatial data (e.g. raster and shapefiles) and tabular data (e.g. Esri geodatabase or database file). Comma-separated value (CSV) files can be used but must first be converted into an Esri format.

  • USGS Metadata Wizard 2.x - A cross-platform desktop application, modeled on the original Metadata Wizard, for creating CSDGM metadata. This version of the Metadata Wizard does not have Esri dependencies and supports additional tabular data file formats.

  • USGS TKME - A Windows tool for creating FGDC-CSDGM metadata, which can be configured for the Biological Data Profile and other extensions. The program is closely aligned with the Metadata Parser and can be configured for French and Spanish.

  • mdEditor - A web-based tool for creating ISO and FGDC metadata.

  • Data dictionary conversion service - Converts a data dictionary table to or from metadata format (instructions).

  • USDA Metavist - A desktop metadata editor for creating FGDC-CSDGM metadata for geospatial data. Includes the Biological Data Profile (version 1.6). Produced and maintained by the USDA Forest Service. Download the USGS Alaska Science Center (ASC) Metavist User Guide [PDF] to learn more about the tool and ASC best practices for authors.

  • Microsoft XML Notepad - A simple, intuitive user interface for browsing and editing XML files. Does not automatically produce FGDC-CSDGM records but allows easy editing and validating of existing metadata records. See Advanced Users to learn how to configure this tool.
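
As a rough illustration of what converting a data dictionary table into metadata format involves, the sketch below turns a small CSV data dictionary into a CSDGM-style Entity and Attribute (eainfo) fragment using only the Python standard library. It is not the USGS conversion service. The CSV column names (attribute, definition, source), the sample rows, and the function name data_dictionary_to_eainfo are assumptions made for this example, and the attribute domain values (attrdomv) that a complete record needs are left out for brevity.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical data dictionary; the column names are assumptions for this sketch.
DATA_DICTIONARY_CSV = """attribute,definition,source
site_id,Unique identifier for the sampling site,Producer defined
temp_c,Water temperature in degrees Celsius,Producer defined
"""


def data_dictionary_to_eainfo(csv_text, entity_label, entity_definition):
    """Build a CSDGM-style <eainfo> element from data dictionary rows."""
    eainfo = ET.Element("eainfo")
    detailed = ET.SubElement(eainfo, "detailed")

    # Describe the table (entity) itself.
    enttyp = ET.SubElement(detailed, "enttyp")
    ET.SubElement(enttyp, "enttypl").text = entity_label
    ET.SubElement(enttyp, "enttypd").text = entity_definition
    ET.SubElement(enttyp, "enttypds").text = "Producer defined"

    # One <attr> block per data dictionary row (domain values omitted here).
    for row in csv.DictReader(io.StringIO(csv_text)):
        attr = ET.SubElement(detailed, "attr")
        ET.SubElement(attr, "attrlabl").text = row["attribute"]
        ET.SubElement(attr, "attrdef").text = row["definition"]
        ET.SubElement(attr, "attrdefs").text = row["source"]
    return eainfo


eainfo = data_dictionary_to_eainfo(
    DATA_DICTIONARY_CSV, "samples.csv", "Table of field samples"
)
print(ET.tostring(eainfo, encoding="unicode"))
```

Keeping the data dictionary as a plain table and generating the XML from it means the attribute definitions only need to be maintained in one place.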

 

Best Practices for Metadata Creation 

  • Gather all information together, especially if multiple people have information that you need.
     
  • Use information that is already developed.
    • Re-use text from grant or funding proposals (e.g. abstract, purpose, date, etc.).
    • Reference the data dictionary that was used during data collection and processing to complete the Entity & Attribute section of a CSDGM metadata record. 
       
  • Choose a descriptive title for your dataset that incorporates who, what, where, why, and scale.
    • Example: Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps between 1961-1983
       
  • Choose keywords wisely: Consider all of the possible interpretations of your word choices and use a thesaurus to add descriptive terms you may not have otherwise selected.
     
  • Placement of the DOI for the dataset in a CSDGM metadata record
    • The DOI should go in the primary <onlink> in the Citation Information section.
    • Make sure the DOI is formatted as a URL (https://doi.org/10.5066/ABCD123), not in the form doi:10.5066/ABCD123. If your DOI is not entered as a URL, your metadata record will be rejected by catalogs such as the USGS Science Data Catalog and Data.gov.
       
  • Placement of the DOI for the related publication in a CSDGM record
    • The related publication is usually cited as a Larger Work Citation in the metadata. The Larger Work Citation has its own <onlink> field, and this is the correct location for the publication's DOI.
    • Make sure the DOI is formatted as a URL (https://doi.org/10.3133/ABCD123), not in the form doi:10.3133/ABCD123. If your DOI is not entered as a URL, your metadata record will be rejected by catalogs such as the USGS Science Data Catalog and Data.gov.
       
  • Include as many details as you can in the metadata record for future users of the data.
     
  • Review your metadata for completeness and accuracy.
    • Ask someone unfamiliar with the project to review your metadata objectively.
    • Check for clarity and omissions.
       
  • Use the best practices described in the Systems Level Applications or Collections [PDF] for large data systems or when describing "collections" of datasets.
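
The DOI formatting rule in the best practices above can be applied mechanically before a DOI is pasted into an <onlink> field. The following is a minimal sketch using the Python standard library; the function name doi_to_url is hypothetical, and the pattern used to recognize a DOI (10., a 4 to 9 digit prefix, a slash, then a suffix) is a common heuristic assumed here, not an official rule from the catalogs.

```python
import re

DOI_URL_PREFIX = "https://doi.org/"


def doi_to_url(doi):
    """Return the DOI as an https://doi.org/ URL, whatever common form it arrives in."""
    value = doi.strip()
    # Drop a leading "doi:" label or an existing doi.org / dx.doi.org resolver prefix.
    value = re.sub(
        r"^(doi:\s*|https?://(dx\.)?doi\.org/)", "", value, flags=re.IGNORECASE
    )
    # Heuristic check (an assumption of this sketch): "10.", 4-9 digits, "/", suffix.
    if not re.match(r"^10\.\d{4,9}/\S+$", value):
        raise ValueError(f"Not a recognizable DOI: {doi!r}")
    return DOI_URL_PREFIX + value


print(doi_to_url("doi:10.5066/ABCD123"))   # → https://doi.org/10.5066/ABCD123
print(doi_to_url("10.3133/ABCD123"))       # → https://doi.org/10.3133/ABCD123
```

Input that is already a URL passes through unchanged, so the function can be run safely on every DOI in a record.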

 

Validating Metadata Records 

You must validate metadata to ensure that it has been created properly and that all required elements are filled in. Validation compares the XML metadata record against the metadata standard to confirm that the record conforms to the structure of the standard. For FGDC-CSDGM metadata, see the best practices in Checking Metadata with Data [PDF]. Many metadata creation and editing tools (such as OME and Metadata Wizard) validate automatically, so a second validation may not be necessary.
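
To show the general idea of checking a record for missing elements, here is a small sketch using Python's standard xml.etree.ElementTree module. It is not a substitute for a real validator such as the USGS Metadata Parser: the list of paths is only an illustrative subset of CSDGM elements chosen for this example, the function name find_missing_elements is hypothetical, and the check covers presence and non-empty text only, not the full structure of the standard.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of CSDGM element paths; the standard requires many more.
REQUIRED_PATHS = [
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/descript/purpose",
    "metainfo/metd",
    "metainfo/metstdn",
]


def find_missing_elements(xml_text, required_paths=REQUIRED_PATHS):
    """Return the required paths that are absent or empty in the record."""
    # ET.fromstring raises ET.ParseError if the XML is not well-formed.
    root = ET.fromstring(xml_text)
    missing = []
    for path in required_paths:
        text = root.findtext(path)  # None if absent, "" if present but empty
        if text is None or not text.strip():
            missing.append(path)
    return missing


# Hypothetical, deliberately incomplete record.
sample = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Example dataset title</title></citeinfo></citation>
    <descript><abstract>Example abstract.</abstract><purpose></purpose></descript>
  </idinfo>
</metadata>"""

print(find_missing_elements(sample))
# → ['idinfo/descript/purpose', 'metainfo/metd', 'metainfo/metstdn']
```

A quick check like this can catch obvious omissions early, but the record should still be run through a full validator before release.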

Tools:

The USGS Metadata Parser validates XML metadata records against the FGDC-CSDGM standard (Public domain).

  • USGS Metadata Parser – A tool that validates XML metadata records against the FGDC-CSDGM standard and reports any errors it finds. Suitable for geospatial and non-geospatial datasets. Users can view XML metadata records in easy-to-read formats (HTML, text). It is multilingual (English, French, and Spanish) and can be configured for the Biological Data Profile and other extensions. Advanced users can learn how to Run MP from the Command Line window [PDF].
     
  • Microsoft XML Notepad – The tool offers the ability to validate records but requires a schema package. See Advanced Users to learn more.

 

My Metadata is Created, What’s Next? 

  1. USGS policy requires a formal review of the data and metadata if they are intended for a USGS data release.
     
  2. Package your data and metadata together whenever possible since the metadata record is critical to understanding the data.
     
  3. Work with your organization to identify how metadata should be shared, or visit Publish and Share for more information. Sharing metadata improves discoverability, access, and reuse of the data. The USGS Science Data Catalog is the approved mechanism for serving USGS metadata to data.doi.gov, data.gov, geoplatform.gov, and similar catalogs.

 

Advanced Users 

Microsoft XML Notepad - An XML editor that can help create and edit metadata records directly in XML code. The software is free to download but only available for PC systems.

 

EML to CSDGM-BDP Transform [XSL] - This transform file converts metadata in the Ecological Metadata Language (EML) standard to the FGDC-CSDGM Biological Data Profile. After transformation, validate the metadata record and check that the content was adequately transferred.

 

What the U.S. Geological Survey Manual Requires: 

The USGS Survey Manual chapter SM 502.7 Fundamental Science Practices: Metadata for USGS Scientific Information Products including Data provides metadata requirements for USGS scientific information products and scientific data that are Bureau-approved for release.

SM 502.7 further specifies that metadata must accompany all USGS scientific data and other information products. Metadata records are to be developed in a standardized way that enables users to understand the context and to evaluate the usefulness of the data or information product. Metadata records for scientific data must comply with standards such as the FGDC Content Standard for Digital Geospatial Metadata, the International Organization for Standardization suite of standards, or other USGS-endorsed FGDC standards. A minimum of one metadata review by a qualified reviewer is required for all USGS scientific data and other information products approved for release.

The USGS Survey Manual chapter SM 502.8 Fundamental Science Practices: Review and Approval of Scientific Data for Release discusses when metadata requirements apply for release of scientific data.

SM 502.8 further specifies that scientific data approved for release must comply with the metadata requirements described in SM 502.7, and that the metadata must be deposited in and shared through the USGS Science Data Catalog. Reviews of the data and the associated metadata are required, and these reviews must be documented in the internal USGS Information Product Data System (IPDS).

For additional guidance, please refer to the Fundamental Science Practices FAQ: Metadata for USGS Scientific Data.

 
