Genetic analysis is increasingly used to understand ecosystem processes and inform conservation, management, and policy. I assist USGS researchers and their collaborators in the design, analysis, and interpretation of high-throughput genetic studies. Common applications include detecting genes responsive to particular environmental stressors in a sentinel species or species of conservation concern; generating reference genome sequences of pathogens for functional or evolutionary analysis; identifying genetic variation that distinguishes populations or species; and using “barcode” sequences to identify species in gut contents, feces, or environmental samples.
Transcriptomics
Various RNAs are transcribed from the genome, including the mRNAs that encode proteins. Isolating the mRNA from a tissue sample and sequencing it with high-throughput technology allows the underlying “coding sequence” of the genome to be reconstructed and compared to databases of known sequences to infer potential functions. It also allows the statistical analysis of differential gene expression between two or more sample types, such as experimentally manipulated cohorts and the corresponding control cohort. Such an experiment may establish the physiological relevance of a potential stressor, or identify candidate biomarkers that provide early warning of those stressors before irreparable harm is done to a sensitive population.
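The comparison of expression between a treated and a control cohort can be illustrated with a minimal sketch. It assumes per-gene read counts are already available; the gene names, the counts, and the pseudocount are invented for illustration, and the function names are hypothetical. It computes only a normalized log2 fold change, not a statistical test, so it is a toy illustration of the idea rather than the analysis used in any particular study.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}

def log2_fold_change(control, treated, pseudocount=1.0):
    """Log2 ratio of treated to control expression for each gene.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no reads in one sample.
    """
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2((cpm_treated[gene] + pseudocount)
                        / (cpm_control[gene] + pseudocount))
        for gene in control
    }

# Invented toy counts for three hypothetical genes.
control = {"geneA": 100, "geneB": 100, "geneC": 800}
treated = {"geneA": 400, "geneB": 100, "geneC": 500}

lfc = log2_fold_change(control, treated)
for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:+.2f}")
```

A positive value indicates higher expression in the treated cohort and a negative value indicates lower expression. A real experiment would also need replicate samples and a statistical model of count variability before calling any gene significantly differentially expressed.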
Population Genomics
The genomes of most organisms are very similar within a given species, with increasing levels of divergence accruing at higher taxonomic ranks reflecting longer divergence times. However, most genomes contain millions to billions of bases of DNA, so that even if only 1 in 100,000 bases varies within a species, individuals may still differ from conspecifics at thousands of genomic positions. High-throughput sequencing facilitates the identification of these variable sites and estimation of the frequencies of the different variants. This information can be used to construct well-supported species phylogenies, estimate gene flow or hybridization events, and detect non-neutral patterns of evolution. This type of information is critical for effective implementation of conservation measures.
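The arithmetic behind this argument, and the estimation of variant frequencies, can be sketched in a few lines. The genome size of one billion bases and the genotype values below are illustrative assumptions, not data from any study, and the function names are hypothetical.

```python
def expected_variable_sites(genome_size_bp, bases_per_variable_site):
    """Expected number of variable positions if one base in every
    `bases_per_variable_site` varies within the species."""
    return genome_size_bp / bases_per_variable_site

def allele_frequency(alt_allele_counts):
    """Frequency of the alternate allele at one site in diploid individuals.

    `alt_allele_counts` holds 0, 1, or 2 copies of the alternate allele
    per individual.
    """
    return sum(alt_allele_counts) / (2 * len(alt_allele_counts))

# A genome of one billion bases with 1 variable base per 100,000.
sites = expected_variable_sites(1_000_000_000, 100_000)
print(sites)

# Five hypothetical individuals genotyped at a single variable site.
freq = allele_frequency([0, 1, 2, 0, 1])
print(freq)
```

Under these assumptions the genome contains about 10,000 variable positions, consistent with the "thousands of genomic positions" described above. The same rate applied to a genome of a few million bases would yield only tens of positions.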
Metagenomics
Metagenomics is the analysis of genomic fragments in complex mixtures that are inherently difficult to separate by species, such as microbial communities or a host tissue infected with a virus. High-throughput sequencing and computational approaches are used to identify, for example, the microbial taxa present, the genes they harbor (such as those encoding toxins or antibiotics), and the complex interactions between pathogen and host that might lead to disease.
Barcode Sequencing
Genetic “barcodes” are unique signatures that reveal the species from which detected DNA derives. Barcode sequencing complements metagenomics, differing in that only certain small regions that are most informative of biological origin are examined, rather than the entire “metagenome” of the sample. Barcode methods can be used to reconstruct the diet of animal species noninvasively from feces, quantify the biodiversity of minute larvae dispersing in water, or detect invasive plant species with a pollen trap.
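The idea of assigning a detected sequence to a species by its barcode can be shown with a toy sketch. The reference sequences, species labels, and the 90% identity threshold are all invented for illustration, and the function names are hypothetical. Real barcode analyses compare reads against curated reference databases using sequence alignment, which this simple position-by-position comparison does not attempt.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy sketch requires equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def assign_species(query, references, min_identity=0.9):
    """Return (best_matching_species, identity).

    The species is None when no reference reaches `min_identity`,
    a threshold chosen arbitrarily for this sketch.
    """
    best_name, best_score = None, 0.0
    for name, ref in references.items():
        score = identity(query, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_identity:
        return None, best_score
    return best_name, best_score

# Invented 20-base reference barcodes for two hypothetical species.
references = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGAACGTTCGAACGA",
}

# A query that differs from species_A at one position.
query = "ACGTACGTACGTACGTACGA"
name, score = assign_species(query, references)
print(name, score)
```

A sequence that matches no reference closely enough is left unassigned rather than forced onto the nearest species, which matters when the reference set does not cover every taxon in the sample.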
Figure: Genes that are significantly differentially expressed (red dots) in Eastern Elliptio mussels (Elliptio complanata) exposed to hypersalinity. USGS image.
Figure: Identification of genetically distinct groups of sage grouse (Centrocercus sp.) using thousands of genomic sequence tags. USGS image.
Figure: Phylogenetic reconstruction of 11 isolates of Chelonid herpesvirus 5, the putative cause of fibropapillomatosis in sea turtles. USGS image.
Figure: Expected taxonomic recovery of known walrus (Odobenus rosmarus) prey items from four different genetic markers. USGS image.
Below are publications associated with this project.
The bee microbiome: Impact on bee health and model for evolution and ecology of host-microbe interactions
As pollinators, bees are cornerstones for terrestrial ecosystem stability and key components in agricultural productivity. All animals, including bees, are associated with a diverse community of microbes, commonly referred to as the microbiome. The bee microbiome is likely to be a crucial factor affecting host health. However, with the exception of a few pathogens, the impacts of most members of t
Authors: Philipp Engel, Waldan K. Kwong, Quinn McFrederick, Kirk E. Anderson, Seth Michael Barribeau, James Angus Chandler, Robert S. Cornman, Jacques Dainat, Joachim R. de Miranda, Vincent Doublet, Olivier Emery, Jay D. Evans, Laurent Farinelli, Michelle L. Flenniken, Fredrik Granberg, Juris A. Grasis, Laurent Gauthier, Juliette Hayer, Hauke Koch, Sarah Kocher, Vincent G. Martinson, Nancy Moran, Monica Munoz-Torres, Irene Newton, Robert J. Paxton, Eli Powell, Ben M. Sadd, Paul Schmid-Hempel, Regula Schmid-Hempel, Se Jin Song, Ryan S. Schwarz, Dennis vanEngelsdorp, Benjamin Dainat
Taxonomic characterization of honey bee (Apis mellifera) pollen foraging based on non-overlapping paired-end sequencing of nuclear ribosomal loci
Identifying plant taxa that honey bees (Apis mellifera) forage upon is of great apicultural interest, but traditional methods are labor intensive and may lack resolution. Here we evaluate a high-throughput genetic barcoding approach to characterize trap-collected pollen from multiple North Dakota apiaries across multiple years. We used the Illumina MiSeq platform to generate sequence scaffolds fro
Authors: Robert S. Cornman, Clint R.V. Otto, Deborah D. Iwanowicz, Jeffery S Pettis
Z chromosome divergence, polymorphism and relative effective population size in a genus of lekking birds
Sex chromosomes contribute disproportionately to species boundaries as they diverge faster than autosomes and often have reduced diversity. Their hemizygous nature contributes to faster divergence and reduced diversity, as do some types of selection. In birds, other factors (mating system and bottlenecks) can further decrease the effective population size of Z-linked loci and accelerate divergence
Authors: Sara J. Oyler-McCance, Robert S. Cornman, Kenneth L. Jones, Jennifer A. Fike
Characterization of a novel hepadnavirus in the white sucker (Catostomus commersonii) from the Great Lakes Region of the USA
The white sucker Catostomus commersonii is a freshwater teleost often utilized as a resident sentinel. Here, we sequenced the full genome of a hepatitis B-like virus that infects white suckers from the Great Lakes Region of the USA. Dideoxysequencing confirmed the white sucker hepatitis B virus (WSHBV) has a circular genome (3542 bp) with the prototypical codon organization of hepadnaviruses. Elec
Authors: Cassidy M. Hahn, Luke R. Iwanowicz, Robert S. Cornman, Carla M. Conway, James R. Winton, Vicki S. Blazer
Genomic single-nucleotide polymorphisms confirm that Gunnison and Greater sage-grouse are genetically well differentiated and that the Bi-State population is distinct
Sage-grouse are iconic, declining inhabitants of sagebrush habitats in western North America, and their management depends on an understanding of genetic variation across the landscape. Two distinct species of sage-grouse have been recognized, Greater (Centrocercus urophasianus) and Gunnison sage-grouse (C. minimus), based on morphology, behavior, and variation at neutral genetic markers. A parapa
Authors: Sara J. Oyler-McCance, Robert S. Cornman, Kenneth L. Jones, Jennifer A. Fike
Transcriptome resources for the frogs Lithobates clamitans and Pseudacris regilla, emphasizing antimicrobial peptides and conserved loci for phylogenetics
We developed genetic resources for two North American frogs, Lithobates clamitans and Pseudacris regilla, widespread native amphibians that are potential indicator species of environmental health. For both species, mRNA from multiple tissues was sequenced using 454 technology. De novo assemblies with Mira3 resulted in 50 238 contigs (N50 = 687 bp) and 48 213 contigs (N50 = 686 bp) for L. clamitans
Authors: Laura S. Robertson, Robert S. Cornman