Spring 2022 topics include information on our migration from Confluence to SharePoint, making sure your metadata record gets into the SDC, a tip on non-ASCII characters, and a featured data release on a novel interaction between intraguild predators.
Migration from Confluence to SharePoint - Training and Resources
This past winter, there was a big push to migrate USGS content from Confluence to SharePoint. Previously, the ScienceBase Data Release (SBDR) Team maintained a Confluence site to document upcoming trainings and post recordings of past events. We have now migrated that content to a new ScienceBase Data Release SharePoint Site.
This site is a great starting place for people who are new to the SBDR process. There are links to the SBDR Tool, the SBDR instructions, and our new SBDR Training & Resources page. On the SBDR Training & Resources page, you can see when we will be hosting our next General SBDR Training and Revision Training Events, as well as connection information for joining the events. You can also find recordings from previous events and notes from the question-and-answer sessions.
Is information missing, or are there additional resources that would help you with the SBDR process? Let us know by emailing email@example.com!
Upcoming Updates to the SBDR Tool
Updates are coming soon to the ScienceBase Data Release (SBDR) Tool to support looking up authors and importing their ORCIDs from Active Directory or from our new non-USGS author database. This people lookup will help us improve the quality of the metadata that we maintain in our digital object identifiers (DOIs). Currently, we have a lot of messy data about people. In the DOI Tool, we sometimes have multiple records with conflicting information, and we can't always tell whether those records represent the same person. Messy data are very difficult for people and computers to resolve, and they make it challenging for our USGS systems to communicate. For example, if a DOI record doesn't have the correct information for an author, that product will not show up on their USGS staff profile. Messy author data also make it impossible for us to understand who our USGS staff are collaborating with on products, especially non-USGS co-authors.
The updates to the SBDR Tool will help us keep our author data clean. These updates will include uniquely identifying authors as they are entered into the system - both USGS and non-USGS authors - through a people lookup service. The people lookup service is a separate service that we are integrating into the Tool. The lookup service will work for both the IPDS Autofill feature and for entering authors manually.
If you work with non-USGS authors on data products, you will be asked to enter information about those authors into the non-USGS author database. The more information that you can provide about these authors, especially their ORCIDs or professional email addresses, the easier it will be to look them up in the future. If you have a list of non-USGS authors that you or your Science Center commonly collaborate with, please send us their information in advance (e.g., Full Name, ORCID, Email, Affiliation) and we can bulk load them into the non-USGS author database. Contact firstname.lastname@example.org for more information!
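If you are assembling a list of collaborators to send us, a quick check of each ORCID before you send it can catch typos. The sketch below is one way to do that in Python; it is not part of the SBDR Tool. It verifies the ORCID check digit (ISO 7064 MOD 11-2, the checksum that ORCID iDs use) and writes the list as CSV. The function names, the sample author, and the column headers (taken from the fields listed above) are assumptions for illustration, not a required format.

```python
import csv
import io
import re

# ORCID iDs look like 0000-0002-1825-0097; the last character may be "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")

# Assumed column headers, based on the fields named in the text above.
FIELDS = ["Full Name", "ORCID", "Email", "Affiliation"]


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the ORCID is well formed and its check digit matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    chars = orcid.replace("-", "")
    total = 0
    for ch in chars[:15]:  # the first 15 digits feed the checksum
        total = (total + int(ch)) * 2
    expected = (12 - total % 11) % 11
    return chars[15] == ("X" if expected == 10 else str(expected))


def authors_to_csv(authors) -> str:
    """Return CSV text for a list of author dicts; reject bad ORCIDs."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for author in authors:
        if not is_valid_orcid(author["ORCID"]):
            raise ValueError(f"Check the ORCID for {author['Full Name']}")
        writer.writerow(author)
    return buffer.getvalue()


# Hypothetical author; the iD is the sample iD used in ORCID's own documentation.
authors = [
    {
        "Full Name": "Example Author",
        "ORCID": "0000-0002-1825-0097",
        "Email": "author@example.org",
        "Affiliation": "Example University",
    }
]
print(authors_to_csv(authors))
```

A passing checksum only shows that the iD was typed consistently. It does not confirm that the iD belongs to the person named on that row.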
Why Isn't My Metadata Record in the SDC?
USGS policy requires that metadata for approved USGS scientific data be deposited in and shared through the USGS Science Data Catalog (SDC). The SDC provides access to data across multiple USGS repositories. If you’re releasing data through the ScienceBase data release process, your metadata will automatically be sent to the SDC by the ScienceBase Data Release Team within 24 hours of your data going public. However, not all published data releases make it into the SDC, because some metadata records do not meet specific requirements.
If your data release has been published, but you don’t see the metadata in the SDC, your metadata record may be missing one of the following SDC requirements:
Title:
The title is one of the most important pieces of information in a metadata record. A title must be present in the metadata record (.xml file) for the record to be valid and indexed in the SDC. Best practices suggest using a title that incorporates who, what, where, when, and scale. To be admitted to the SDC, titles must be at least five characters long.
Metadata contact electronic mail address:
The metadata contact email address of the organization or individual author must be present and valid. Weblinks are not valid email addresses and will prevent your metadata record from being harvested by the SDC.
Metadata date:
The metadata date is the date that the metadata was created or last updated. This date must be present and valid in the metadata. A valid metadata date includes year, month, and day in the following format: ‘20220430’.
Metadata validation:
The metadata record must pass validation against the metadata standard to ensure it has been structured properly and all required elements have been filled in. Validation compares the record's XML content to the metadata standard to confirm that it conforms to the required structure and can therefore be parsed by the SDC. See best practices for Checking Metadata with Data [PDF] for FGDC-CSDGM metadata. Please be aware that many metadata creation and editing tools (such as OME and Metadata Wizard) can validate these records automatically.
Persistent identifier:
The SDC requires a registered persistent identifier (PID) that is unique for every metadata record in the Catalog. This persistent identifier enables the SDC and downstream federal data catalogs to uniquely identify and recognize metadata records. The ScienceBase Data Release Team will register and add PIDs to all ScienceBase data releases automatically upon publication, so authors don’t have to worry about this step.
All of the requirements above must be met, or the metadata attached to the release will not be harvested by the SDC. If one of the requirements was missed when the data release was created, please reach out to the ScienceBase Data Release Team (email@example.com) so that we can work out a solution.
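If you would like to spot-check a record yourself before publication, the following Python sketch tests three of the requirements above (title length, contact email, and metadata date) in an FGDC-CSDGM XML file. It is an illustration only, not the SDC's actual harvest logic. It assumes the standard CSDGM element paths (title under idinfo/citation/citeinfo, cntemail under metainfo/metc/cntinfo, and metd under metainfo), and the email pattern is a deliberately simple assumption. The function name and the sample record are made up for the example. It does not replace full validation against the standard, and it does not check the PID.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Simple, assumed pattern: something@something.tld, no spaces or slashes.
EMAIL_RE = re.compile(r"^[^@\s/]+@[^@\s/]+\.[^@\s/]+$")


def _text(root, path):
    """Return the stripped text at an element path, or "" if it is absent."""
    node = root.find(path)
    return (node.text or "").strip() if node is not None else ""


def sdc_preflight(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    problems = []

    title = _text(root, "idinfo/citation/citeinfo/title")
    if len(title) < 5:
        problems.append("Title is missing or shorter than five characters.")

    email = _text(root, "metainfo/metc/cntinfo/cntemail")
    if not EMAIL_RE.match(email):
        problems.append("Metadata contact email is missing or not an email address.")

    metd = _text(root, "metainfo/metd")
    if not re.fullmatch(r"[0-9]{8}", metd):
        problems.append("Metadata date is missing or not in YYYYMMDD format.")
    else:
        try:
            datetime.strptime(metd, "%Y%m%d")  # rejects dates such as 20220431
        except ValueError:
            problems.append("Metadata date is not a real calendar date.")

    return problems


# Made-up record with a weblink where the email address should be.
SAMPLE = (
    "<metadata><idinfo><citation><citeinfo>"
    "<title>Example data release title</title>"
    "</citeinfo></citation></idinfo>"
    "<metainfo><metd>20220430</metd><metc><cntinfo>"
    "<cntemail>https://www.usgs.gov</cntemail>"
    "</cntinfo></metc></metainfo></metadata>"
)

for problem in sdc_preflight(SAMPLE):
    print(problem)
```

For a file on disk, read its contents and pass the text to `sdc_preflight`. A clean result from this sketch means only that these three fields look reasonable; the record still needs to pass validation in a tool such as OME or Metadata Wizard.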
Subscribe to the ScienceBase Mailing List for Quarterly Updates.