Metadata describe information about a dataset, such that a dataset can be understood, re-used, and integrated with other datasets. Information described in a metadata record includes where the data were collected, who is responsible for the dataset, why the dataset was created, and how the data are organized. Metadata generally follow a standard format, making it easier to compare datasets and to transfer files electronically.
Why Do We Need Metadata?
- Metadata create longevity for data.
- Use metadata to understand and re-use data.
- Access to searchable metadata helps avoid data duplication and reduces workload.
- Metadata enable the sharing of reliable information.
- Metadata transcend people and time.
- Data are not complete without a metadata record.
- Identify new partnership opportunities when metadata are shared.
- Ensure organizational investment in data.
- Use mandated Federal metadata standards (see Executive Order 12906).
- Keep all documentation associated with your data.
Metadata are crucial for any potential use or reuse of data. No one can responsibly re-use or interpret data without accompanying metadata that explain how and why the dataset was created, where it is geographically located, and how the data are structured and what they mean.
There are many uses for metadata, even beyond the simple discovery of datasets. Metadata can be used for understanding data, analysis and synthesis, maintaining longevity of a dataset for an organization, tracking the progress of a research project, and demonstrating the return on investment for research at an institution.
Federal agencies are mandated by Executive Order 12906 to use the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata. A transition to the ISO Standard, which the FGDC endorses, is currently under way.
Metadata are not just for geospatial data. Consider a biological data spreadsheet containing species occurrence information. It can be documented using a metadata standard to answer questions about the data, including, but not limited to, where the data were collected, what each column header means, and why the data were collected.
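As a minimal sketch of the spreadsheet case above, the following Python snippet checks that every column header has a definition in a data dictionary. The column names, definitions, sample rows, and the helper `undocumented_columns` are all hypothetical and chosen only for illustration; a real record would carry these definitions in whatever metadata standard you use.

```python
import csv
import io

# Hypothetical species occurrence spreadsheet, inlined as CSV text.
SAMPLE_CSV = """species,obs_date,lat,lon,count
Castor canadensis,2011-06-14,44.42,-110.58,3
Ursus arctos,2011-06-15,44.60,-110.50,1
"""

# Hypothetical data dictionary: one definition per column header.
# The "count" column is left out on purpose to show a gap.
DATA_DICTIONARY = {
    "species": "Scientific name of the observed organism",
    "obs_date": "Date of observation (YYYY-MM-DD)",
    "lat": "Latitude in decimal degrees",
    "lon": "Longitude in decimal degrees",
}


def undocumented_columns(csv_text, dictionary):
    """Return the column headers that have no definition in the dictionary."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)  # first row holds the column headers
    return [h for h in headers if h not in dictionary]


print(undocumented_columns(SAMPLE_CSV, DATA_DICTIONARY))  # -> ['count']
```

A check like this can be run before writing the metadata record, so that gaps in the data dictionary are found while the people who collected the data are still available to fill them.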
- Get organized: Writing good metadata begins with being organized.
- Organization seems like a logical step, but many choose to start writing a metadata record before having all of the necessary information. It makes a difference to stop, get organized, and make a plan. Begin by gathering all of your information together, especially if multiple people have the information that you need.
- Use information that is already developed. Information needed for high-quality metadata records is often already written.
- Re-use text written for grant or funding proposals; e.g., abstract, purpose, location of data collection, etc.
- A data dictionary created during data collection and processing can be referenced in the metadata, so have that available to include in your record.
- Select a tool to help you create your metadata.
- Titles: Choose a title for your dataset that incorporates who, what, where, when, and scale.
- Example: Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps (1961-1983)
- Why is this a good example? The title includes enough information to be informative: Greater Yellowstone (where), Rivers (what), 1:126,700 (scale), U.S. Forest Service (who), and 1961-1983 (when).
- Writing your Metadata:
- Choose keywords wisely: Consider all of the possible interpretations of your word choices and use a thesaurus to add descriptive terms you may not have otherwise selected.
- Include as many details as you can in the record so that readers can surmise what is in your data before they go further.
- Review: Check your metadata for completeness and accuracy.
- It is a good practice to ask someone else to read your metadata. Ideally, another person unfamiliar with the project is able to review your metadata record objectively.
- Check for clarity and omissions. Your metadata records will live for a long time.
- Make revisions based on comments you receive.
- The metadata record should be understandable by anyone, regardless of professional background. Another option is to ask experts in particular areas to review only the parts of the metadata relevant to their expertise. This approach takes a small amount of time from several people, but the record will likely be reviewed more thoroughly because each expert reads only the information of interest to them.
- If you are the only reviewer, it is best to let some time pass before completing your final product. Mistakes and omissions are often found on a second reading.
- Export your metadata record from the tool you are using in XML format and keep it with your dataset.
- Make sure your record is compliant with FGDC standards. Many tools have validation built in. If yours does not, import the record into Metadata Parser (http://geo-nsdi.er.usgs.gov/validation/), a USGS tool for metadata validation, and reconcile any errors that are identified.
- Upload your metadata into a clearinghouse so that you can advertise the work you have done and let others re-use your data for new research purposes; e.g., USGS Core Science Metadata Clearinghouse (http://mercury.ornl.gov/clearinghouse).
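Before sending an exported XML record to a validator, a quick completeness check can catch obvious omissions. The sketch below is not a substitute for Metadata Parser or for a tool's built-in validation. It assumes the record uses the short element names of the FGDC Content Standard for Digital Geospatial Metadata (such as `idinfo`, `citeinfo`, and `abstract`), and the list of paths it checks is an illustrative subset chosen for this example, not the standard's full set of mandatory elements. The sample record contents and the helper `missing_elements` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of element paths, written relative to the <metadata>
# root. This list is an assumption for the example, not the full standard.
REQUIRED_PATHS = [
    "idinfo/citation/citeinfo/origin",
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/descript/purpose",
    "idinfo/keywords/theme/themekey",
    "metainfo/metstdn",
]


def missing_elements(xml_text, required=REQUIRED_PATHS):
    """Return the required paths that are absent or empty in the record."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in required:
        node = root.find(path)
        # Treat an element with only whitespace as missing too.
        if node is None or not (node.text or "").strip():
            missing.append(path)
    return missing


# Hypothetical, deliberately incomplete record with placeholder text.
SAMPLE_RECORD = """<metadata>
  <idinfo>
    <citation><citeinfo>
      <origin>Example originator</origin>
      <title>Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps (1961-1983)</title>
    </citeinfo></citation>
    <descript><abstract>Placeholder abstract text.</abstract></descript>
  </idinfo>
</metadata>"""

for path in missing_elements(SAMPLE_RECORD):
    print("missing:", path)
```

For the sample record, the check reports the purpose, theme keyword, and metadata standard name as missing. Full compliance still needs to be confirmed with a validator.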
- Chatfield, T., Selbach, R. February, 2011. Data Management for Data Stewards. Data Management Training Workshop. Bureau of Land Management (BLM).
- DataONE education modules. Accessed June 13, 2012.