Knowledge Extraction Algorithms (KEA): Turning Literature Into Data
Identifying, extracting, and mobilizing information from current and historical literature is a time-consuming part of organizing and collating synthesized data products. This project explored algorithm-based methods for identifying and extracting species occurrence information from the GeoDeepDive (GDD) literature database to support upkeep of the Nonindigenous Aquatic Species (NAS) Database. The GeoDeepDive API was extended to support queries on terms from the Integrated Taxonomic Information System (ITIS). This functionality helped identify literature that mentions or focuses on species tracked by the NAS Database. These query methods were paired with algorithms that extract location information associated with each term mention. Work continues to improve these algorithms and the overall workflow.
Principal Investigator: Matthew E Neilson
Co-Investigators: Daniel J Wieferich, Wesley M Daniel, Brandon S Serna
Cooperator/Partner: Shanan Peters
- Source: USGS ScienceBase (id: 5acd2680e4b0e2c2dd155dfd)
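
The pairing of taxon-term matching with location extraction described in the summary can be illustrated with a minimal sketch. This is not the project's algorithm and it does not call the GeoDeepDive API: the function name extract_mentions, the tiny gazetteer, and the example sentences are all hypothetical, and the place names are invented. It only shows one simple way to link a term mention to locations, by keeping place names that occur in the same sentence as the taxon name.

```python
import re

# Hypothetical gazetteer of place names (invented for illustration only).
GAZETTEER = {"Lake Example", "Sample River"}


def extract_mentions(sentences, taxon, gazetteer):
    """Return (sentence_index, place) pairs for sentences that mention the taxon.

    Assumption: a place is "associated" with a taxon mention when both
    appear in the same sentence. Real workflows may use richer context.
    """
    taxon_re = re.compile(r"\b" + re.escape(taxon) + r"\b", re.IGNORECASE)
    hits = []
    for i, sentence in enumerate(sentences):
        if not taxon_re.search(sentence):
            continue  # sentence does not mention the taxon of interest
        for place in sorted(gazetteer):  # sorted for deterministic output
            if re.search(r"\b" + re.escape(place) + r"\b", sentence):
                hits.append((i, place))
    return hits


# Invented example sentences; they are not taken from any real publication.
sentences = [
    "Dreissena polymorpha was reported from Lake Example.",
    "Sample River supports several native mussels.",
    "Later surveys found dreissena polymorpha in Sample River.",
]

hits = extract_mentions(sentences, "Dreissena polymorpha", GAZETTEER)
for index, place in hits:
    print(index, place)
```

Sentence-level co-occurrence is used here only because it is easy to follow. It skips the second sentence, which names a place but not the taxon, and it would miss locations stated in neighboring sentences.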