Taxonomic composition of environmental DNA acquired by filtration from the St. Regis River, New York

November 23, 2020

Environmental DNA (eDNA) surveys have become important tools for monitoring aquatic biodiversity. Barcode sequencing of eDNA generates community profiles that, while potentially biased in both capture and amplification, can nonetheless yield high information content per unit cost. While factors affecting eDNA capture and amplification have been heavily studied, watershed-scale assessments of fish communities, and evaluations of our confidence in them, have been less frequent. We performed an initial watershed-scale characterization of fish eDNA using rapid, low-volume filtering with replicate and control samples scaled for a single Illumina MiSeq flow cell, using the mitochondrial 12S ribosomal RNA locus for taxonomic profiling. Our bioinformatic approach included (1) direct estimation of sequencing error from unambiguous mappings (alignments) and simulation of error in taxonomic assignment under various mapping criteria; (2) binning of species based on inferred assignment error rather than by taxonomic rank; and (3) visualization of mismatch distributions to facilitate discovery of distinct haplotypes attributed to the same reference. Our approach was implemented for the St. Regis River, New York, United States, which supports a valuable recreational fishery and has been a target of restoration activities. We used a large record of St. Regis-specific observations to validate our assignments. We found that 300 mL drawn through 25-mm filters yielded more than 5 ng/µL of DNA at most sites in August and September, which was an approximate threshold for generating strong sequencing libraries in our hands. Using inferred sequence error rates, we binned 12S references for 110 species on a state-level checklist into 85 single-species bins and seven multispecies bins. Of 48 taxonomic bins actually observed in the St. Regis, we detected eDNA consistent with 40, with an additional four detections flagged as potential contaminants post-collection. Sixteen unobserved species detected by eDNA ranged from plausible to implausible based on distributional data, whereas six observed species had no 12S reference sequence.
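
The error-estimation and binning steps described above can be illustrated with a small sketch. This is not the authors' pipeline: it assumes pre-aligned, equal-length sequences, estimates a per-base error rate from read/reference pairs, and then groups references that differ by no more than a chosen number of mismatches into the same bin, as a simple stand-in for binning by inferred assignment error. The function names, the `max_mismatches` parameter, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Count mismatched positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def estimate_error_rate(pairs):
    """Per-base mismatch rate over (read, reference) pairs.

    Assumes each read maps unambiguously to its reference, so every
    mismatch is attributed to sequencing error.
    """
    mismatches = sum(hamming(read, ref) for read, ref in pairs)
    bases = sum(len(ref) for _, ref in pairs)
    return mismatches / bases


def bin_references(refs, max_mismatches):
    """Group references that are within max_mismatches of each other.

    Uses a small union-find so that bins are transitive: if A is close
    to B and B is close to C, all three share a bin.
    """
    parent = {name: name for name in refs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(refs, 2):
        if hamming(refs[a], refs[b]) <= max_mismatches:
            parent[find(a)] = find(b)

    bins = {}
    for name in refs:
        bins.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in bins.values())


# Hypothetical toy data, not real 12S sequences.
pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),  # perfect match
    ("ACGTACGAAC", "ACGTACGTAC"),  # one mismatch
]
refs = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTAT",  # one mismatch from species_A
    "species_C": "TTGTACCTAA",  # several mismatches from both
}

error_rate = estimate_error_rate(pairs)
bins = bin_references(refs, max_mismatches=1)
print(error_rate, bins)
```

In this sketch, references that cannot be told apart at the chosen mismatch tolerance end up in a multispecies bin, while well-separated references keep single-species bins. A fuller treatment would derive the tolerance from simulated reads under the estimated error rate rather than fixing it by hand.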