Understanding how data are used across the scientific community provides many benefits to data authors, including building a better awareness and comprehension of 1) a dataset's scientific impact, 2) use cases to direct future versions, and 3) related efforts. Effectively tracking when and how data are used in the literature through time can be challenging. This is in part due to a lack of consistency in how data are referenced in scientific publications and whether or not publishers index data citations [@Green:2009;@Corti et al. 2019]. The Make Data Count initiative (https://makedatacount.org) is encouraging publishers to implement standard data citation policies, including requiring that authors cite data in their references list using digital object identifiers (DOIs) or other globally unique and persistent identifiers, and indexing these data citations with Crossref. This initiative will enhance the ability of data authors to track downstream use of their data. Many publishers are adopting these practices [@Cousijn:2018]; however, data citation indexing may not happen retroactively.
Publink provides methods to extract and build relationships between publications and the datasets they reference. Methods are included to support pipelines for tracking existing and new relationships through time. The package currently leverages two different sources of information: 1) the eXtract Dark Data (xDD) digital library and machine reading system (formally known as GeoDeepDive, https://geodeepdive.org/), and 2) Crossref/DataCite Event Data (https://www.eventdata.crossref.org/guide/).
|Authors||Daniel J Wieferich, Brandon S Serna, Madison L Langseth, Drew A Ignizio, United States Geological Survey, Science Analytics and Synthesis|
|Product Type||Software Release|
|Record Source||USGS Digital Object Identifier Catalog|
|USGS Organization||Science Analytics and Synthesis|