1.
Bioinform Adv ; 3(1): vbad168, 2023.
Article in English | MEDLINE | ID: mdl-38046098

ABSTRACT

Summary: Quantifying genetic clusters (=populations) from genotypic data is a fundamental, but non-trivial task for population geneticists that is compounded by: hierarchical population structure, diverse analytical methods, and complex software dependencies. AdmixPipe v3 ameliorates many of these issues in a single bioinformatic pipeline that facilitates all facets of population structure analysis by integrating outputs generated by several popular packages (i.e. CLUMPAK, EvalAdmix). The pipeline interfaces disparate software packages to parse Admixture outputs and conduct EvalAdmix analyses in the context of multimodal population structure results identified by CLUMPAK. We further streamline these tasks by packaging AdmixPipe v3 within a Docker container to create a standardized analytical environment that allows for complex analyses to be replicated by different researchers. This also grants operating system flexibility and mitigates complex software dependencies. Availability and implementation: Source code, documentation, example files, and usage examples are freely available at https://github.com/stevemussmann/admixturePipeline. Installation is facilitated via Docker container available from https://hub.docker.com/r/mussmann/admixpipe. Usage under Windows operating systems requires the Windows Subsystem for Linux.

2.
Proc Biol Sci ; 290(1999): 20230768, 2023 05 31.
Article in English | MEDLINE | ID: mdl-37192670

ABSTRACT

Hybridization is a complicated, oft-misunderstood process. Once deemed unnatural and uncommon, hybridization is now recognized as ubiquitous among species. But hybridization rates within and among communities are poorly understood despite the relevance to ecology, evolution and conservation. To clarify, we examined hybridization across 75 freshwater fish communities within the Ozarks of the North American Interior Highlands (USA) by single nucleotide polymorphism (SNP) genotyping 33 species (N = 2865 individuals; double-digest restriction site-associated DNA sequencing (ddRAD)). We found evidence of hybridization (70 putative hybrids; 2.4% of individuals) among 18 species-pairs involving 73% (24/33) of study species, with the majority being concentrated within one family (Leuciscidae/minnows; 15 species; 66 hybrids). Interspecific genetic exchange, or introgression, was evident from 24 backcrossed individuals (10/18 species-pairs). Hybrids occurred within 42 of 75 communities (56%). Four selected environmental variables (species richness, protected area extent, precipitation (May and annually)) exhibited 73-78% accuracy in predicting hybrid occurrence via random forest classification. Our community-level assessment identified hybridization as spatially widespread and environmentally dependent (albeit predominantly within one diverse, omnipresent family). Our approach provides a more holistic survey of natural hybridization by testing a wide range of species-pairs, thus contrasting with more conventional evaluations.


Subject(s)
Hybridization, Genetic , Metagenomics , Animals , Sequence Analysis, DNA
3.
Mol Ecol ; 32(24): 6743-6765, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36461662

ABSTRACT

Genetic differentiation among local groups of individuals, that is, genetic β-diversity, is a key component of population persistence related to connectivity and isolation. However, most genetic investigations of natural populations focus on a single species, overlooking opportunities for multispecies conservation plans to benefit entire communities in an ecosystem. We present an approach to evaluate genetic β-diversity within and among many species and demonstrate how this riverscape community genomics approach can be applied to identify common drivers of genetic structure. Our study evaluated genetic β-diversity in 31 co-distributed native stream fishes sampled from 75 sites across the White River Basin (Ozarks, USA) using SNP genotyping (ddRAD). Despite variance among species in the degree of genetic divergence, general spatial patterns were identified corresponding to river network architecture. Most species (N = 24) were partitioned into discrete subpopulations (K = 2-7). We used partial redundancy analysis to compare species-specific genetic β-diversity across four models of genetic structure: Isolation by distance (IBD), isolation by barrier (IBB), isolation by stream hierarchy (IBH), and isolation by environment (IBE). A significant proportion of intraspecific genetic variation was explained by IBH (x̄ = 62%), with the remaining models generally redundant. We found evidence for consistent spatial modularity in that gene flow is higher within rather than between hierarchical units (i.e., catchments, watersheds, basins), supporting the generalization of the stream hierarchy model. We discuss our conclusions regarding conservation and management and identify the 8-digit hydrologic unit (HUC) as the most relevant spatial scale for managing genetic diversity across riverine networks.


Subject(s)
Ecosystem , Genetics, Population , Humans , Genetic Variation/genetics , Metagenomics , Environment , Rivers
4.
PLoS One ; 16(12): e0260344, 2021.
Article in English | MEDLINE | ID: mdl-34882713

ABSTRACT

Ecological restoration can promote biodiversity conservation in anthropogenically fragmented habitats, but effectiveness of these management efforts needs to be statistically validated to determine 'success.' One such approach is to gauge the extent of recolonization as a measure of landscape permeability and, in turn, population connectivity. In this context, we estimated dispersal and population connectivity in prairie vole (Microtus ochrogaster; N = 231) and meadow vole (M. pennsylvanicus; N = 83) within five tall-grass prairie restoration sites embedded within the agricultural matrix of midwestern North America. We predicted that vole dispersal would be constrained by the extent of agricultural land surrounding restored habitat patches, spatially isolating vole populations and resulting in significant genetic structure. We first employed genetic assignment tests based on 15 microsatellite DNA loci to validate field-derived species-designations, then tested reclassified samples with multivariate and Bayesian clustering to assay for spatial and temporal genetic structure. Population connectivity was further evaluated by calculating pairwise FST, then potential demographic effects explored by computing migration rates, effective population size (Ne), and average relatedness (r). Genetic species assignments reclassified 25% of initial field identifications (N = 11 M. ochrogaster; N = 67 M. pennsylvanicus). In M. ochrogaster population connectivity was high across the study area, reflected in little to no spatial or temporal genetic structure. In M. pennsylvanicus genetic structure was detected, but relatedness estimates identified it as kin-clustering instead, underscoring social behavior among populations rather than spatial isolation as the cause. Estimates of Ne and r were stable across years, reflecting high dispersal and demographic resilience. Combined, these metrics suggest the agricultural matrix is highly permeable for voles and does not impede dispersal. High connectivity observed confirms that the restored landscape is productive and permeable for specific management targets such as voles and also demonstrates population genetic assays as a tool to statistically evaluate effectiveness of conservation initiatives.


Subject(s)
Arvicolinae/classification , Arvicolinae/physiology , Microsatellite Repeats , Animals , Arvicolinae/genetics , Bayes Theorem , Environmental Restoration and Remediation , Female , Gene Flow , Genetic Variation , Genetics, Population , Grassland , North America , Population Density , Population Dynamics
5.
R Soc Open Sci ; 8(10): 210727, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34729207

ABSTRACT

The recurrence of similar evolutionary patterns within different habitats often reflects parallel selective pressures acting upon either standing or independently occurring genetic variation to produce a convergence of phenotypes. This interpretation (i.e. parallel divergences within adjacent streams) has been hypothesized for drainage-specific morphological 'ecotypes' observed in polyploid snowtrout (Cyprinidae: Schizothorax). However, parallel patterns of differential introgression during secondary contact are a viable alternative hypothesis. Here, we used ddRADseq (N = 35 319 de novo and N = 10 884 transcriptome-aligned SNPs), as derived from Nepali/Bhutanese samples (N = 48 each), to test these competing hypotheses. We first employed genome-wide allelic depths to derive appropriate ploidy models, then a Bayesian approach to yield genotypes statistically consistent under the inferred expectations. Elevational 'ecotypes' were consistent in geometric morphometric space, but with phylogenetic relationships at the drainage level, sustaining a hypothesis of independent emergence. However, partitioned analyses of phylogeny and admixture identified subsets of loci under selection that retained genealogical concordance with morphology, suggesting instead that apparent patterns of morphological/phylogenetic discordance are driven by widespread genomic homogenization. Here, admixture occurring in secondary contact effectively 'masks' previous isolation. Our results underscore two salient factors: (i) morphological adaptations are retained despite hybridization and (ii) the degree of admixture varies across tributaries, presumably concomitant with underlying environmental or anthropogenic factors.

6.
BMC Bioinformatics ; 22(1): 501, 2021 Oct 16.
Article in English | MEDLINE | ID: mdl-34656096

ABSTRACT

BACKGROUND: Patterns of multi-locus differentiation (i.e., genomic clines) often extend broadly across hybrid zones and their quantification can help diagnose how species boundaries are shaped by adaptive processes, both intrinsic and extrinsic. In this sense, the transitioning of loci across admixed individuals can be contrasted as a function of the genome-wide trend, in turn allowing an expansion of clinal theory across a much wider array of biodiversity. However, computational tools that serve to interpret and consequently visualize 'genomic clines' are limited, and users must often write custom, relatively complex code to do so. RESULTS: Here, we introduce the ClineHelpR R-package for visualizing genomic clines and detecting outlier loci using output generated by two popular software packages, bgc and Introgress. ClineHelpR bundles both input generation (i.e., filtering datasets and creating specialized file formats) and output processing (e.g., MCMC thinning and burn-in) with functions that directly facilitate interpretation and hypothesis testing. Tools are also provided for post-hoc analyses that interface with external packages such as ENMeval and RIdeogram. CONCLUSIONS: Our package increases the reproducibility and accessibility of genomic cline methods, thus allowing an expanded user base and promoting these methods as mechanisms to address diverse evolutionary questions in both model and non-model organisms. Furthermore, the ClineHelpR extended functionality can evaluate genomic clines in the context of spatial and environmental features, allowing users to explore underlying processes potentially contributing to the observed patterns and helping facilitate effective conservation management strategies.
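
The output-processing step mentioned above (MCMC burn-in and thinning) can be illustrated generically. The following is a minimal Python sketch, not ClineHelpR's own R interface; the function name `burn_and_thin` and the toy chain are hypothetical and serve only to show what the two operations do to a list of posterior draws.

```python
from typing import List, Sequence


def burn_and_thin(samples: Sequence[float], burn_in: int, thin: int) -> List[float]:
    """Discard the first `burn_in` draws, then keep every `thin`-th remaining draw.

    Hypothetical helper for illustration; it is not part of ClineHelpR or bgc.
    """
    if burn_in < 0 or thin < 1:
        raise ValueError("burn_in must be >= 0 and thin must be >= 1")
    return list(samples[burn_in::thin])


if __name__ == "__main__":
    # Toy chain of 20 draws (values are just their own indices).
    chain = [float(i) for i in range(20)]
    kept = burn_and_thin(chain, burn_in=10, thin=3)
    print(kept)  # [10.0, 13.0, 16.0, 19.0]
```

Burn-in removes early draws taken before the chain has settled, and thinning reduces autocorrelation between retained draws; suitable values depend on the run and are not stated in the abstract.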


Subject(s)
Genome , Hybridization, Genetic , Biological Evolution , Genomics , Humans , Reproducibility of Results
7.
Genome Biol Evol ; 13(9)2021 09 01.
Article in English | MEDLINE | ID: mdl-34432005

ABSTRACT

Species are indisputable units for biodiversity conservation, yet their delimitation is fraught with both conceptual and methodological difficulties. A classic example is the taxonomic controversy surrounding the Gila robusta complex in the lower Colorado River of southwestern North America. Nominal species designations were originally defined according to weakly diagnostic morphological differences, but these conflicted with subsequent genetic analyses. Given this ambiguity, the complex was re-defined as a single polytypic unit, with the proposed "threatened" status under the U.S. Endangered Species Act of two elements being withdrawn. Here we re-evaluated the status of the complex by utilizing dense spatial and genomic sampling (n = 387 and >22 k loci), coupled with SNP-based coalescent and polymorphism-aware phylogenetic models. In doing so, we found that all three species were indeed supported as evolutionarily independent lineages, despite widespread phylogenetic discordance. To juxtapose this discrepancy with previous studies, we first categorized those evolutionary mechanisms driving discordance, then tested (and subsequently rejected) prior hypotheses which argued phylogenetic discord in the complex was driven by the hybrid origin of Gila nigra. The inconsistent patterns of diversity we found within G. robusta were instead associated with rapid Plio-Pleistocene drainage evolution, with subsequent divergence within the "anomaly zone" of tree space producing ambiguities that served to confound prior studies. Our results not only support the resurrection of the three species as distinct entities but also offer an empirical example of how phylogenetic discordance can be categorized within other recalcitrant taxa, particularly when variation is primarily partitioned at the species level.


Subject(s)
Cyprinidae , Rivers , Animals , Colorado , Cyprinidae/genetics , Phylogeny , Uncertainty
8.
Evol Appl ; 14(6): 1673-1689, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34178112

ABSTRACT

Approximately 100 years ago, unregulated harvest nearly eliminated white-tailed deer (Odocoileus virginianus) from eastern North America, which subsequently served to catalyze wildlife management as a national priority. An extensive stock-replenishment effort soon followed, with deer broadly translocated among states as a means of re-establishment. However, an unintended consequence was that natural patterns of gene flow became obscured and pretranslocation signatures of population structure were replaced. We applied cutting-edge molecular and biogeographic tools to disentangle genetic signatures of historical management from those reflecting spatially heterogeneous dispersal by evaluating 35,099 single nucleotide polymorphisms (SNPs) derived via reduced-representation genomic sequencing from 1143 deer sampled statewide in Arkansas. We then employed Simpson's diversity index to summarize ancestry assignments and visualize spatial genetic transitions. Using sub-sampled transects across these transitions, we tested clinal patterns across loci against theoretical expectations of their response under scenarios of re-colonization and restricted dispersal. Two salient results emerged: (A) Genetic signatures from historic translocations are demonstrably apparent; and (B) Geographic filters (major rivers; urban centers; highways) now act as inflection points for the distribution of this contemporary ancestry. These results yielded a statewide assessment of contemporary population structure in deer as driven by historic translocations as well as ongoing processes. In addition, the analytical framework employed herein to effectively decipher extant/historic drivers of deer distribution in Arkansas is also applicable for other biodiversity elements with similarly complex demographic histories.
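
As an illustration of how Simpson's diversity index can summarize an individual's ancestry assignment, here is a minimal Python sketch. It assumes the Gini-Simpson form, 1 minus the sum of squared ancestry proportions; the abstract does not state which variant of the index the authors used, and the function name `simpson_diversity` is hypothetical.

```python
from typing import Sequence


def simpson_diversity(proportions: Sequence[float]) -> float:
    """Gini-Simpson index, 1 - sum(p_k^2), over ancestry proportions.

    Assumption: the 1 - D form is used here for illustration only.
    Input is renormalized so that the proportions sum to 1.
    """
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    return 1.0 - sum((p / total) ** 2 for p in proportions)


if __name__ == "__main__":
    # An individual assigned entirely to one cluster has zero diversity.
    print(simpson_diversity([1.0, 0.0, 0.0]))  # 0.0
    # An individual split evenly between two clusters.
    print(simpson_diversity([0.5, 0.5]))  # 0.5
```

Under this form, values near zero mark individuals with a single dominant ancestry, while higher values mark admixed individuals, which is what makes the index useful for mapping spatial transitions between ancestries.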

9.
Mol Ecol Resour ; 21(8): 2801-2817, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33566450

ABSTRACT

Model-based approaches that attempt to delimit species are hampered by computational limitations as well as the unfortunate tendency by users to disregard algorithmic assumptions. Alternatives are clearly needed, and machine-learning (M-L) is attractive in this regard as it functions without the need to explicitly define a species concept. Unfortunately, its performance will vary according to which (of several) bioinformatic parameters are invoked. Herein, we gauge the effectiveness of M-L-based species-delimitation algorithms by parsing 64 variably-filtered versions of a ddRAD-derived SNP data set collected from North American box turtles (Terrapene spp.). Our filtering strategies included: (i) minor allele frequencies (MAF) of 5%, 3%, 1%, and 0% (= none), and (ii) maximum missing data per-individual/per-population at 25%, 50%, 75%, and 100% (= no filtering). We found that species-delimitation via unsupervised M-L impacted the signal-to-noise ratio in our data, as well as the discordance among resolved clades. The latter may also reflect biogeographic history, gene flow, incomplete lineage sorting, or combinations thereof (as corroborated from previously observed patterns of differential introgression). Our results substantiate M-L as a viable species-delimitation method, but also demonstrate how commonly observed patterns of phylogenetic discordance can seriously impact M-L-classification.
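
The filtering design described above can be sketched in Python. This is an illustrative reconstruction, not the authors' code: it assumes that the 64 data sets arise from crossing the four MAF thresholds with four per-individual and four per-population missing-data thresholds (4 x 4 x 4), and it assumes genotypes coded as alternate-allele counts (0, 1, 2) with `None` for missing calls. All names are hypothetical.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

# Thresholds taken from the abstract.
MAF_THRESHOLDS = [0.05, 0.03, 0.01, 0.0]
MISSING_THRESHOLDS = [0.25, 0.50, 0.75, 1.00]


def filter_grid() -> List[Tuple[float, float, float]]:
    """All (MAF, per-individual missing, per-population missing) combinations.

    Assumption: the three settings are fully crossed, giving 4 * 4 * 4 = 64.
    """
    return list(product(MAF_THRESHOLDS, MISSING_THRESHOLDS, MISSING_THRESHOLDS))


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """MAF at one biallelic SNP; genotypes are alt-allele counts, None = missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf(genotypes: Sequence[Optional[int]], threshold: float) -> bool:
    """Keep a SNP when its MAF meets the threshold (0.0 keeps everything)."""
    return minor_allele_frequency(genotypes) >= threshold


if __name__ == "__main__":
    print(len(filter_grid()))  # 64
    snp = [0, 1, 2, None, 0]
    print(minor_allele_frequency(snp))  # 0.375
    print(passes_maf(snp, 0.05))  # True
```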


Subject(s)
Turtles , Animals , Gene Flow , Machine Learning , North America , Phylogeny , Turtles/genetics
10.
Prion ; 14(1): 238-248, 2020 12.
Article in English | MEDLINE | ID: mdl-33078661

ABSTRACT

Chronic-wasting disease (CWD) is a prion-derived fatal neurodegenerative disease that has affected wild cervid populations on a global scale. Susceptibility has been linked unambiguously to several amino acid variants within the prion protein gene (PRNP). Quantifying their distribution across landscapes can provide critical information for agencies attempting to adaptively manage CWD. Here we attempt to further define management implications of PRNP polymorphism by quantifying the contemporary geographic distribution (i.e., phylogeography) of PRNP variants in hunter-harvested white-tailed deer (WTD; Odocoileus virginianus, N = 1433) distributed across Arkansas (USA), including a focal spot for CWD since detection of the disease in February 2016. Of these, PRNP variants associated with the well-characterized 96S non-synonymous substitution showed a significant increase in relative frequency among older CWD-positive cohorts. We interpreted this pattern as reflective of a longer life expectancy for 96S genotypes in a CWD-endemic region, suggesting either decreased probabilities of infection or reduced disease progression. Other variants showing statistical signatures of potential increased susceptibility, however, seemingly reflect an artefact of population structure. We also showed marked heterogeneity across the landscape in the prevalence of 'reduced susceptibility' genotypes. This may indicate, in turn, that differences in disease susceptibility among WTD in Arkansas are an innate, population-level characteristic that is detectable through phylogeographic analysis.


Subject(s)
Aging/genetics , Aging/pathology , Deer/genetics , Polymorphism, Genetic , Prion Proteins/genetics , Animals , Female , Gene Frequency/genetics , Geography , Haplotypes/genetics , Odds Ratio
11.
Mol Ecol ; 29(21): 4186-4202, 2020 11.
Article in English | MEDLINE | ID: mdl-32882754

ABSTRACT

Hybridization occurs differentially across the genome in a balancing act between selection and migration. With the unprecedented resolution of contemporary sequencing technologies, selection and migration can now be effectively quantified such that researchers can identify genetic elements involved in introgression. Furthermore, genomic patterns can now be associated with ecologically relevant phenotypes, given availability of annotated reference genomes. We do so in North American box turtles (Terrapene) by deciphering how selection affects hybrid zones at the interface of species boundaries and identifying genetic regions potentially under selection that may relate to thermal adaptations. Such genes may impact physiological pathways involved in temperature-dependent sex determination, immune system functioning and hypoxia tolerance. We contrasted these patterns across inter- and intraspecific hybrid zones that differ temporally and biogeographically. We demonstrate hybridization is broadly apparent in Terrapene, but with observed genomic cline patterns corresponding to species boundaries at loci potentially associated with thermal adaptation. These loci display signatures of directional introgression within intraspecific boundaries, despite a genome-wide selective trend against intergrades. In contrast, outlier loci for interspecific comparisons exhibited evidence of being under selection against hybrids. Importantly, adaptations coinciding with species boundaries in Terrapene overlap with climatic boundaries and highlight the vulnerability of these terrestrial ectotherms to anthropogenic pressures.


Subject(s)
Turtles , Animals , Genome , Genomics , Hybridization, Genetic , Phenotype , Turtles/genetics , United States
12.
Ecol Evol ; 10(13): 6477-6493, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32724527

ABSTRACT

The delimitation of species boundaries, particularly those obscured by reticulation, is a critical step in contemporary biodiversity assessment. It is especially relevant for conservation and management of indigenous fishes in western North America, represented herein by two species with dissimilar life histories codistributed in the highly modified Colorado River (i.e., flannelmouth sucker, Catostomus latipinnis; bluehead sucker, C. (Pantosteus) discobolus). To quantify phylogenomic patterns and examine proposed taxonomic revisions, we first employed double-digest restriction site-associated DNA sequencing (ddRAD), yielding 39,755 unlinked SNPs across 139 samples. These were subsequently evaluated with multiple analytical approaches and by contrasting life history data. Three phylogenetic methods and a Bayesian assignment test highlighted similar phylogenomic patterns in each, but with considerable difference in presumed times of divergence. Three lineages were detected in bluehead sucker, supporting elevation of C. (P.) virescens to species status and recognizing C. (P.) discobolus yarrowi (Zuni bluehead sucker) as a discrete entity. Admixture in the latter necessitated a reevaluation of its contemporary and historic distributions, underscoring how biodiversity identification can be confounded by complex evolutionary histories. In addition, we defined three separate flannelmouth sucker lineages as ESUs (evolutionarily significant units), given limited phenotypic and genetic differentiation, contemporary isolation, and lack of concordance (per the genealogical concordance component of the phylogenetic species concept). Introgression was diagnosed in both species, with the Little Colorado and Virgin rivers in particular. Our diagnostic methods, and the agreement of our SNPs with previous morphological, enzymatic, and mitochondrial work, allowed us to partition complex evolutionary histories into requisite components, such as isolation versus secondary contact.

13.
BMC Bioinformatics ; 21(1): 337, 2020 Jul 29.
Article in English | MEDLINE | ID: mdl-32727359

ABSTRACT

BACKGROUND: Research on the molecular ecology of non-model organisms, while previously constrained, has now been greatly facilitated by the advent of reduced-representation sequencing protocols. However, tools that allow these large datasets to be efficiently parsed are often lacking, or if indeed available, then limited by the necessity of a comparable reference genome as an adjunct. This, of course, can be difficult when working with non-model organisms. Fortunately, pipelines are currently available that avoid this prerequisite, thus allowing data to be a priori parsed. An oft-used molecular ecology program (i.e., STRUCTURE), for example, is facilitated by such pipelines, yet they are surprisingly absent for a second program that is similarly popular and computationally more efficient (i.e., ADMIXTURE). The two programs differ in that ADMIXTURE employs a maximum-likelihood framework whereas STRUCTURE uses a Bayesian approach, yet both produce similar results. Given these issues, there is an overriding (and recognized) need among researchers in molecular ecology for bioinformatic software that will not only condense output from replicated ADMIXTURE runs, but also infer from these data the optimal number of population clusters (K). RESULTS: Here we provide such a program (i.e., ADMIXPIPE) that (a) filters SNPs to allow the delineation of population structure in ADMIXTURE, then (b) parses the output for summarization and graphical representation via CLUMPAK. Our benchmarks effectively demonstrate how efficient the pipeline is for processing large, non-model datasets generated via double digest restriction-site associated DNA sequencing (ddRAD). Outputs not only parallel those from STRUCTURE, but also visualize the variation among individual ADMIXTURE runs, so as to facilitate selection of the most appropriate K-value. 
CONCLUSIONS: ADMIXPIPE successfully integrates ADMIXTURE analysis with popular variant call format (VCF) filtering software to yield file types readily analyzed by CLUMPAK. Large population genomic datasets derived from non-model organisms are efficiently analyzed via the parallel-processing capabilities of ADMIXTURE. ADMIXPIPE is distributed under the GNU Public License and freely available for Mac OSX and Linux platforms at: https://github.com/stevemussmann/admixturePipeline .
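
One common way to compare K-values across replicated ADMIXTURE runs is to average the cross-validation error reported for each K. The Python sketch below illustrates that idea only; it is not ADMIXPIPE's implementation. It assumes log lines of the form `CV error (K=3): 0.47000`, as printed when ADMIXTURE is run with cross-validation enabled, and the numeric values in the example are invented.

```python
import re
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable

# Assumed log-line format: "CV error (K=3): 0.47000"
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def mean_cv_by_k(log_lines: Iterable[str]) -> Dict[int, float]:
    """Average the cross-validation error over replicate runs for each K."""
    errors = defaultdict(list)
    for line in log_lines:
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))].append(float(match.group(2)))
    return {k: mean(values) for k, values in sorted(errors.items())}


def best_k(log_lines: Iterable[str]) -> int:
    """Return the K with the lowest mean cross-validation error."""
    means = mean_cv_by_k(log_lines)
    if not means:
        raise ValueError("no CV error lines found")
    return min(means, key=means.get)


if __name__ == "__main__":
    # Invented values for two replicates at each of three K-values.
    example = [
        "CV error (K=2): 0.52000",
        "CV error (K=2): 0.54000",
        "CV error (K=3): 0.47000",
        "CV error (K=3): 0.49000",
        "CV error (K=4): 0.50000",
        "CV error (K=4): 0.51000",
    ]
    print(best_k(example))  # 3
```

The lowest mean error is a heuristic rather than a definitive answer, which is consistent with the abstract's emphasis on visualizing variation among individual runs before settling on a K-value.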


Subject(s)
Models, Biological , Software , Bayes Theorem , Computational Biology , Genome , Polymorphism, Single Nucleotide
14.
Heredity (Edinb) ; 123(6): 759-773, 2019 12.
Article in English | MEDLINE | ID: mdl-31431737

ABSTRACT

Many species have evolved or currently coexist in sympatry due to differential adaptation in a heterogeneous environment. However, anthropogenic habitat modifications can either disrupt reproductive barriers or obscure environmental conditions which underlie fitness gradients. In this study, we evaluated the potential for an anthropogenically-mediated shift in reproductive boundaries that separate two historically sympatric fish species (Gila cypha and G. robusta) endemic to the Colorado River Basin using ddRAD sequencing of 368 individuals. We first examined the integrity of reproductive isolation while in sympatry and allopatry, then characterized hybrid ancestries using genealogical assignment tests. We tested for localized erosion of reproductive isolation by comparing site-wise genomic clines against global patterns and identified a breakdown in the drainage-wide pattern of selection against interspecific heterozygotes. This, in turn, allowed for the formation of a hybrid swarm in one tributary, and asymmetric introgression where species co-occur. We also detected a weak but significant relationship between genetic purity and degree of consumptive water removal, suggesting a role for anthropogenic habitat modifications in undermining species boundaries or expanding historically limited introgression. In addition, results from basin-wide genomic clines suggested that hybrids and parental forms are adaptively nonequivalent. If so, then a failure to manage for hybridization will exacerbate the long-term extinction risk in parental populations. These results reinforce the role of anthropogenic habitat modification in promoting interspecific introgression in sympatric species by relaxing divergent selection. This, in turn, underscores a broader role for hybridization in decreasing global biodiversity within rapidly deteriorating environments.


Subject(s)
Cyprinidae/genetics , Genetic Speciation , Genetics, Population , Animals , Ecosystem , Gene Flow/genetics , Genome/genetics , Genomics , Humans , Hybridization, Genetic , North America , Reproductive Isolation , Sympatry/genetics
15.
Bioinformatics ; 34(24): 4293-4296, 2018 12 15.
Article in English | MEDLINE | ID: mdl-29961853

ABSTRACT

Motivation: It is a non-trivial task to identify and design capture probes ('baits') for the diverse array of targeted-enrichment methods now available (e.g. ultra-conserved elements, anchored hybrid enrichment, RAD-capture). This often involves parsing large genomic alignments, followed by multiple steps of curating candidate genomic regions to optimize targeted information content (e.g. genetic variation) and to minimize potential probe dimerization and non-target enrichment. Results: In this context, we developed MrBait, a user-friendly, generalized software pipeline for identification, design and optimization of targeted-enrichment probes across a range of target-capture paradigms. MrBait is an open-source codebase that leverages native parallelization capabilities in Python and mitigates memory usage via a relational-database back-end. Numerous filtering methods allow comprehensive optimization of designed probes, including built-in functionality that employs BLAST, similarity-based clustering and a graph-based algorithm that 'rescues' failed probes. Availability and implementation: Complete code for MrBait is available on GitHub (https://github.com/tkchafin/mrbait), and is also available with all dependencies via one-line installation using the conda package manager. Online documentation describing installation and runtime instructions can be found at: https://mrbait.readthedocs.io. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Probes , Genomics , Software , Algorithms , Computational Biology