ScienceBase Updates - Summer 2020

Summer 2020 topics include linking publications and data; data manager resources, including scripts; a new autofill-from-IPDS feature in the ScienceBase Data Release Tool; standardized publication dates; a tip on editing your display name in the ScienceBase directory; and a featured data release on landslides triggered by an earthquake.

ScienceBase Updates Header

Linking Publications and Data

Data are often connected to other research outputs, particularly publications. The USGS Science Data Management (SDM) Branch is working to ensure that these connections are documented and available to users of our USGS data.

Related Primary Publications 

One of the most important connections is the primary publication that describes the initial data collection and the first analyses of the data. This publication can give users additional information to help them understand the purpose and scope of the data. The ScienceBase Data Release (SBDR) team is calling this publication the "Related Primary Publication." We collect information about known related primary publications when an author first starts a data release using the ScienceBase Data Release Tool. In many cases, authors can provide only the IPDS number for the related primary publication when they are publishing their data release. We now have an automated pipeline established with the USGS Publications Warehouse to collect the digital object identifier (DOI) for a publication given its IPDS number. The publication's DOI is then associated with the data release in two ways: it is added to the data release landing page in ScienceBase (see image), and it is added to the data release DOI's metadata in the DOI Tool.
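The lookup step can be pictured with a small sketch. This is not the SDM pipeline itself: it assumes publication records have already been retrieved as dictionaries, and the `ipdsId` and `doi` keys, the function names, and the placeholder values are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def normalize_ipds(value: str) -> str:
    """Trim whitespace and upper-case an IPDS number so lookups are consistent."""
    return value.strip().upper()


def build_ipds_index(records: Iterable[dict]) -> dict:
    """Map IPDS numbers to DOIs, skipping records that lack either value.

    The "ipdsId" and "doi" keys are assumed field names, not a documented schema.
    """
    index = {}
    for record in records:
        ipds = record.get("ipdsId")
        doi = record.get("doi")
        if ipds and doi:
            index[normalize_ipds(ipds)] = doi.strip()
    return index


def doi_for_ipds(index: dict, ipds_number: str) -> Optional[str]:
    """Return the DOI for an IPDS number, or None if no publication is known yet."""
    return index.get(normalize_ipds(ipds_number))


# Placeholder records for illustration only; these are not real identifiers.
records = [
    {"ipdsId": "IP-000001", "doi": "10.0000/example-publication"},
    {"ipdsId": "IP-000002"},  # publication not yet assigned a DOI
]
index = build_ipds_index(records)
print(doi_for_ipds(index, " ip-000001 "))
print(doi_for_ipds(index, "IP-000002"))
```

Returning `None` for an unmatched IPDS number reflects the situation described above, where a data release may be published before its related publication has a DOI.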

Screenshot of the External Related Resources section of a ScienceBase landing page

Tracking Data Citations 

One of the reasons that we are required to publicly release data is to increase scientific productivity by allowing others to reuse existing data. Data release authors may find it helpful to understand how others are using their data so that they can improve future data releases and provide evidence of impact. The USGS Science Data Management (SDM) Branch has been working on an automated process to track these data citations using the eXtract Dark Data Database (xDD, formerly known as GeoDeepDive, https://geodeepdive.org/). xDD is a tool that enables text mining of more than 12 million published research documents, and counting. The SDM team is using the xDD API to track references to USGS DOIs, that is, DOIs with the '10.5066' prefix. References to USGS DOIs are stored in the USGS DOI Tool database (see image). To see where a given USGS DOI has been referenced, go to the DOI Tool and search for the data release DOI. In the future, the ScienceBase team plans to display these citations on data release landing pages. Stay tuned for more information.
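To illustrate what identifying a USGS DOI by its prefix involves, here is a minimal sketch that scans a passage of text for DOIs beginning with '10.5066'. It does not call the xDD API and is not the SDM team's code; the pattern, function name, and punctuation handling are assumptions chosen for the example.

```python
import re

# Match the USGS prefix followed by a suffix that runs to the next whitespace,
# quote, or angle bracket. The lookbehind avoids matching inside a longer number.
USGS_DOI_PATTERN = re.compile(r'(?<![\d.])10\.5066/[^\s"<>]+', re.IGNORECASE)


def find_usgs_dois(text: str) -> list:
    """Return unique USGS DOIs found in text, in order of first appearance."""
    found = []
    seen = set()
    for match in USGS_DOI_PATTERN.finditer(text):
        # Drop sentence punctuation that often trails a DOI in running text.
        doi = match.group(0).rstrip(".,;:)]}")
        key = doi.upper()  # DOIs are case-insensitive, so compare in one case
        if key not in seen:
            seen.add(key)
            found.append(doi)
    return found


sample = (
    "Landslide data are available at https://doi.org/10.5066/F7DZ06F9. "
    "See also (doi:10.5066/F7DZ06F9) and https://doi.org/10.1016/j.geomorph.2017.01.030."
)
print(find_usgs_dois(sample))
```

In this sample the data release DOI appears twice and is reported once, while the journal article DOI is ignored because it does not carry the USGS prefix.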

Screenshot of the DOI Tool's interface for seeing where an individual DOI has been cited

Featured Data Release

Map showing data from the 2015 Nepal landslide

Map data of landslides triggered by the 25 April 2015 Mw 7.8 Gorkha, Nepal earthquake

Science Center: Geologic Hazards Science Center 

This data release, created in cooperation with partners at the University of Michigan, ETH, and in Nepal, mapped earthquake-triggered landslides using high-resolution (<1m pixel resolution) pre- and post-event satellite imagery. Since its publication in ScienceBase in 2017, the data have been cited by five publications, and the related primary publication (https://doi.org/10.1016/j.geomorph.2017.01.030) has been cited 68 times per Scopus and read 178 times on Mendeley. Additionally, the dataset helps underpin USGS models used to describe the extent and severity of landslides triggered by earthquakes. For example, one of the citing publications reused the data to propose a comprehensive method for near real-time landslide probability estimation using a logistic regression model based on slope units (Tanyas et al. 2019). Models like this are used operationally and provide situational awareness for earthquake response worldwide.  

Data citation and reuse, as shown in the example above, are only one way of measuring the impact of a data release. If you know of a data product available in ScienceBase that has gone on to be reused in other projects, inform policy decisions, garner attention in major media outlets, or any other interesting use, we'd love to hear about it. Please complete this form to contribute your data story. 

Tanyas, H., Rossi, M., Alvioli, M., van Westen, C.J. and Marchesini, I., 2019, A global slope unit-based method for the near real-time prediction of earthquake-induced landslides: Geomorphology, 327, pp.126-146, https://doi.org/10.1016/j.geomorph.2018.10.022.

Data citation: Roback, K., Clark, M.K., West, A.J., Zekkos, D., Li, G., Gallen, S.F., Chamlagain, D., and Godt, J.W., 2017, Map data of landslides triggered by the 25 April 2015 Mw 7.8 Gorkha, Nepal earthquake: U.S. Geological Survey data release, https://doi.org/10.5066/F7DZ06F9.

Image citation: Roback, K., Clark, M.K., West, A.J., Zekkos, D., Li, G., Gallen, S.F., Chamlagain, D. and Godt, J.W., 2018, The size, distribution, and mobility of landslides caused by the 2015 Mw7.8 Gorkha earthquake, Nepal: Geomorphology, 301, pp.121-138, https://doi.org/10.1016/j.geomorph.2017.01.030.

Data Manager Resources: ScienceBase Scripts

The SBDR team recently shared a collection of ScienceBase data release scripts for use by USGS data managers. There are currently two Python scripts included in the collection: SBDR_Metrics and ScienceCenterRevisionCode. 

SBDR_Metrics provides users with basic metrics about ScienceBase data releases, either overall or within a given time period. The script can return the number of public, in-progress, and revised data releases, as well as counts by mission area and science center. Note: users will only be able to see in-progress data releases for which they have read permissions, as in-progress data releases are not yet public. Additionally, a quality control section can check for missing fields, including mission area, science center, and publication date, and can identify problems such as incorrect publication dates, invalid DOIs, and public data releases that still have an in-progress tag in place.
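The sketch below shows the kind of quality control checks described above. It is not taken from SBDR_Metrics: the record layout (a plain dictionary), the field names, the tag value, and the rule that an "incorrect" publication date means a malformed or future date are all assumptions made for illustration.

```python
import re
from datetime import date, datetime

# Hypothetical field names; the real script may use different ones.
REQUIRED_FIELDS = ("mission_area", "science_center", "publication_date")
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def qc_issues(release: dict, today: date = None) -> list:
    """Return a list of human-readable problems found in one data release record."""
    today = today or date.today()
    issues = []

    # Missing-field checks.
    for field in REQUIRED_FIELDS:
        if not release.get(field):
            issues.append(f"missing {field}")

    # Publication date must parse as YYYY-MM-DD and must not be in the future.
    pub_date = release.get("publication_date")
    if pub_date:
        try:
            parsed = datetime.strptime(pub_date, "%Y-%m-%d").date()
            if parsed > today:
                issues.append("publication date is in the future")
        except ValueError:
            issues.append("publication date is not YYYY-MM-DD")

    # DOI must be a bare identifier such as 10.5066/F7DZ06F9.
    doi = release.get("doi")
    if doi and not DOI_PATTERN.match(doi):
        issues.append("invalid DOI")

    # A public release should no longer carry the in-progress tag.
    if release.get("public") and "in-progress" in release.get("tags", []):
        issues.append("public release still tagged in-progress")

    return issues


# Placeholder record for illustration only.
example = {
    "science_center": "Example Science Center",
    "publication_date": "2020-13-40",
    "doi": "doi:10.5066/EXAMPLE",
    "public": True,
    "tags": ["in-progress"],
}
for problem in qc_issues(example):
    print(problem)
```

Returning a list of messages per record, instead of stopping at the first failure, lets a data manager see every problem with a release in one pass.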

The ScienceCenterRevisionCode notebook allows users to see all revisions completed in ScienceBase for their science center within a given time period. Both scripts are simple and well-documented, and only require beginner’s knowledge of Python and Jupyter Notebooks. Find these notebooks at the SBDR team’s code repository.

Did You Know?

It's possible to edit your display name in the ScienceBase directory. To edit, log in to the ScienceBase directory and select "people". Then search for your name in the system. Please edit only the "display name" field in your profile:  

Screenshot of the "edit person" interface of the ScienceBase directory

Please note, the change will show up in data releases you publish in the future, not in previously published data releases. Users are only able to edit their own profile in the ScienceBase directory. Users can also contact the ScienceBase data release team to update their profile if they no longer have login access to ScienceBase.

Autofill IPDS Update

Do you find yourself entering information into the ScienceBase Data Release (SBDR) tool when you've already entered the same information into IPDS? A feature is now available in the SBDR tool to help streamline this process for authors and data managers. Now, if a user enters an IPDS number associated with their data release and clicks the autofill button, the information that has been entered into IPDS will auto-populate the SBDR tool form. Information including, but not limited to, authors and ORCIDs, title, and science center will be pulled into the SBDR tool. This feature will help reduce the number of errors and the time needed to enter information.

Screenshot of the autofill from IPDS function in the SBDR Tool

Please note, we pull information from IPDS once daily at 5:20 a.m. CT, and we are currently working to increase how often the information is retrieved.

Standardizing Publication Dates

To make a data release public, the SBDR team uses a Jupyter Notebook script that automatically runs through a set of steps that finalize the product. For example, the script moves a completed data release to a public folder, updates the read/write permissions, and adds the current date to the landing page as the publication date. One recent update to the Notebook is that it now overwrites the publication dates in attached .xml metadata files, ensuring that they match the publication date on the landing page.
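As a rough illustration of the metadata step, the sketch below rewrites the publication date in an XML metadata record. It is not the SBDR team's Notebook code. It assumes an FGDC CSDGM-style record in which the date lives at `idinfo/citation/citeinfo/pubdate` in YYYYMMDD form; the function name and sample record are made up for the example.

```python
import xml.etree.ElementTree as ET
from datetime import date


def set_publication_date(xml_text: str, pub_date: date) -> str:
    """Return the metadata XML with the main citation's pubdate overwritten.

    Assumes the root element is <metadata> and the date is stored at
    idinfo/citation/citeinfo/pubdate (FGDC CSDGM layout).
    """
    root = ET.fromstring(xml_text)
    stamp = pub_date.strftime("%Y%m%d")
    # Only the data release's own citation is touched; dates nested elsewhere
    # (for example, cross references to other works) are left unchanged.
    for node in root.findall("./idinfo/citation/citeinfo/pubdate"):
        node.text = stamp
    return ET.tostring(root, encoding="unicode")


# Made-up record for illustration only.
sample = (
    "<metadata><idinfo><citation><citeinfo>"
    "<pubdate>2020</pubdate><title>Example data release</title>"
    "</citeinfo></citation></idinfo></metadata>"
)
print(set_publication_date(sample, date(2020, 7, 15)))
```

Limiting the path to the main citation is a deliberate choice in this sketch, because a metadata record can contain other publication dates that describe related works and should not be changed.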

Screenshot of a metadata record with an arrow pointing to the publication date

This can save time for authors, especially if there are many child items with metadata records, or if authors don't know the publication date in advance. The publication date field in metadata records is displayed on USGS web pages via the Drupal content management system, so ensuring that metadata records have accurate, well-formatted publication dates can help standardize this field across systems.

Screenshot of a publication on a USGS Drupal website with an arrow pointing to the publication date

Subscribe to the ScienceBase Mailing List for Quarterly Updates.