1.
Mol Phylogenet Evol ; 179: 107650, 2023 02.
Article in English | MEDLINE | ID: mdl-36441104

ABSTRACT

The effect of selection acting on regions of the genome on the accuracy of species-level phylogenetic inference using methods that do not explicitly model selection is an open question that is relevant to most, if not all, phylogenomic studies. To address this, we derive a mathematical approximation to the Wright-Fisher model with mutation and selection in the limit as the population size becomes large. In contrast to previous approximations based on diffusion processes, our approximation can be used to study the distribution of coalescent times for an arbitrary number of lineages, allowing calculation of the probability distribution of gene genealogies under the coalescent model. We use these calculations to show that direct selection at strengths typically encountered in practice has only a small effect on the distribution of coalescent times, and hence on the distribution of gene trees. This implies that many coalescent-based methods for estimating the species tree topology will be robust to the presence of selection in a subset of the underlying genes. Selection will, however, bias the estimation of speciation times, causing them to underestimate the true speciation times. Our model captures the effects of selection on the genealogies that generate the observed sequence data, but does not model selective pressures that act only on the subsequent sequences or that negatively impact gene tree estimation.


Subject(s)
Genetic Speciation , Models, Genetic , Phylogeny , Probability , Mutation
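
As a concrete reference point for the abstract above, the following is a minimal sketch of the discrete Wright-Fisher model with mutation and selection that the approximation starts from. It is not the authors' approximation. The haploid setup, the function names, and the parameters `s` (selection coefficient), `u` and `v` (mutation rates away from and toward the focal allele) are assumptions chosen for illustration.

```python
import random


def wright_fisher_step(count, pop_size, s, u, v, rng):
    """Advance one generation; returns the new count of the focal allele."""
    x = count / pop_size
    # Selection: the focal allele has relative fitness 1 + s (requires s > -1).
    x = x * (1.0 + s) / (1.0 + s * x)
    # Mutation: focal -> other at rate u, other -> focal at rate v.
    x = x * (1.0 - u) + (1.0 - x) * v
    # Reproduction: binomial sampling of pop_size offspring.
    return sum(rng.random() < x for _ in range(pop_size))


def simulate(pop_size, generations, x0, s=0.0, u=0.0, v=0.0, seed=0):
    """Return the allele-frequency trajectory, starting from frequency x0."""
    rng = random.Random(seed)
    count = round(x0 * pop_size)
    trajectory = [count / pop_size]
    for _ in range(generations):
        count = wright_fisher_step(count, pop_size, s, u, v, rng)
        trajectory.append(count / pop_size)
    return trajectory
```

Without mutation, frequencies 0 and 1 are absorbing, so `simulate(100, 50, 0.0)` stays at 0 for every generation.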
2.
Science ; 376(6589): 156-162, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35389782

ABSTRACT

Whereas DNA viruses are known to be abundant, diverse, and commonly key ecosystem players, RNA viruses are insufficiently studied outside disease settings. In this study, we analyzed ≈28 terabases of Global Ocean RNA sequences to expand Earth's RNA virus catalogs and their taxonomy, investigate their evolutionary origins, and assess their marine biogeography from pole to pole. Using new approaches to optimize discovery and classification, we identified RNA viruses that necessitate substantive revisions of taxonomy (doubling phyla and adding >50% new classes) and evolutionary understanding. "Species"-rank abundance determination revealed that viruses of the new phyla "Taraviricota," a missing link in early RNA virus evolution, and "Arctiviricota" are widespread and dominant in the oceans. These efforts provide foundational knowledge critical to integrating RNA viruses into ecological and epidemiological models.


Subject(s)
Genome, Viral , RNA Viruses , Viruses , Biological Evolution , Ecosystem , Oceans and Seas , Phylogeny , RNA , RNA Viruses/genetics , Virome/genetics , Viruses/genetics
3.
Syst Biol ; 70(5): 891-907, 2021 08 11.
Article in English | MEDLINE | ID: mdl-33404632

ABSTRACT

Interspecific hybridization is an important evolutionary phenomenon that generates genetic variability in a population and fosters species diversity in nature. The availability of large genome scale data sets has revolutionized hybridization studies to shift from the observation of the presence or absence of hybrids to the investigation of the genomic constitution of hybrids and their genome-specific evolutionary dynamics. Although a handful of methods have been proposed in an attempt to identify hybrids, accurate detection of hybridization from genomic data remains a challenging task. In addition to methods that infer phylogenetic networks or that utilize pairwise divergence, site pattern frequency based and population genetic clustering approaches are popularly used in practice, though the performance of these methods under different hybridization scenarios has not been extensively examined. Here, we use simulated data to comparatively evaluate the performance of four tools that are commonly used to infer hybridization events: the site pattern frequency based methods HyDe and the $D$-statistic (i.e., the ABBA-BABA test) and the population clustering approaches STRUCTURE and ADMIXTURE. We consider single hybridization scenarios that vary in the time of hybridization and the amount of incomplete lineage sorting (ILS) for different proportions of parental contributions ($\gamma$); introgressive hybridization; multiple hybridization scenarios; and a mixture of ancestral and recent hybridization scenarios. We focus on the statistical power to detect hybridization and the false discovery rate (FDR) for comparisons of the $D$-statistic and HyDe, and the accuracy of the estimates of $\gamma$ as measured by the mean squared error for HyDe, STRUCTURE, and ADMIXTURE. Both HyDe and the $D$-statistic are powerful for detecting hybridization in all scenarios except those with high ILS, although the $D$-statistic often has an unacceptably high FDR. The estimates of $\gamma$ in HyDe are impressively robust and accurate whereas STRUCTURE and ADMIXTURE sometimes fail to identify hybrids, particularly when the proportional parental contributions are asymmetric (i.e., when $\gamma$ is close to 0). Moreover, the posterior distribution estimated using STRUCTURE exhibits multimodality in many scenarios, making interpretation difficult. Our results provide guidance in selecting appropriate methods for identifying hybrid populations from genomic data. [ABBA-BABA test; ADMIXTURE; hybridization; HyDe; introgression; Patterson's $D$-statistic; Structure.].


Subject(s)
Genome , Hybridization, Genetic , Genetics, Population , Genomics , Phylogeny
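
The $D$-statistic (ABBA-BABA test) named in the abstract above can be illustrated with a short sketch. It counts the two discordant site patterns across four aligned sequences ordered (P1, P2, P3, outgroup) and returns D = (ABBA - BABA) / (ABBA + BABA). The function name and the handling of gaps are assumptions for illustration. Significance testing (for example by block jackknife) is omitted, and this is not the code evaluated in the study.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned sequences (P1, P2, P3, O)."""
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    abba = baba = 0
    columns = zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper())
    for a, b, c, o in columns:
        if not {a, b, c, o} <= valid:
            continue  # skip gaps and ambiguity codes
        if a == o and b == c and a != b:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != b:
            baba += 1  # P1 and P3 share the derived allele
    informative = abba + baba
    d = (abba - baba) / informative if informative else float("nan")
    return d, abba, baba
```

A value of D near 0 is expected when discordance comes from incomplete lineage sorting alone; an excess of either pattern suggests gene flow between P3 and P1 or P2.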
4.
BMC Evol Biol ; 19(1): 112, 2019 05 30.
Article in English | MEDLINE | ID: mdl-31146685

ABSTRACT

BACKGROUND: Coalescent-based species tree inference has become widely used in the analysis of genome-scale multilocus and SNP datasets when the goal is inference of a species-level phylogeny. However, numerous evolutionary processes are known to violate the assumptions of a coalescence-only model and complicate inference of the species tree. One such process is hybrid speciation, in which a species shares its ancestry with two distinct species. Although many methods have been proposed to detect hybrid speciation, only a few have considered both hybridization and coalescence in a unified framework, and these are generally limited to the setting in which putative hybrid species must be identified in advance. RESULTS: Here we propose a method that can examine genome-scale data for a large number of taxa and detect those taxa that may have arisen via hybridization, as well as their potential "parental" taxa. The method is based on a model that considers both coalescence and hybridization together, and uses phylogenetic invariants to construct a test that scales well in terms of computational time for both the number of taxa and the amount of sequence data. We test the method using simulated data for up to 20 taxa and 100,000 bp, and find that the method accurately identifies both recent and ancient hybrid species in less than 30 s. We apply the method to two empirical datasets, one composed of Sistrurus rattlesnakes for which hybrid speciation is not supported by previous work, and one consisting of several species of Heliconius butterflies for which some evidence of hybrid speciation has been previously found. CONCLUSIONS: The proposed method is powerful for detecting hybridization for both recent and ancient hybridization events. The computations required can be carried out rapidly for a large number of sequences using genome-scale data, and the method is appropriate for both SNP and multilocus data.


Subject(s)
Databases, Genetic , Genomics , Hybridization, Genetic , Models, Genetic , Animals , Butterflies/genetics , Computer Simulation , Crotalus/genetics , Genetic Speciation , Phylogeny , Species Specificity
5.
Stat Appl Genet Mol Biol ; 17(3)2018 06 06.
Article in English | MEDLINE | ID: mdl-29874197

ABSTRACT

The increasing availability of population-level allele frequency data across one or more related populations necessitates the development of methods that can efficiently estimate population genetics parameters, such as the strength of selection acting on the population(s), from such data. Existing methods for this problem in the setting of the Wright-Fisher diffusion model are primarily likelihood-based, and rely on numerical approximation for likelihood computation and on bootstrapping for assessment of variability in the resulting estimates, requiring extensive computation. Recent work has provided a method for obtaining exact samples from general Wright-Fisher diffusion processes, enabling the development of methods for Bayesian estimation in this setting. We develop and implement a Bayesian method for estimating the strength of selection based on the Wright-Fisher diffusion for data sampled at a single time point. The method utilizes the latest algorithms for exact sampling to devise a Markov chain Monte Carlo procedure to draw samples from the joint posterior distribution of the selection coefficient and the allele frequencies. We demonstrate that when assumptions about the initial allele frequencies are accurate the method performs well for both simulated data and for an empirical data set on hypoxia in flies, where we find evidence for strong positive selection in a region of chromosome 2L previously identified. We discuss possible extensions of our method to the more general settings commonly encountered in practice, highlighting the advantages of Bayesian approaches to inference in this setting.


Subject(s)
Bayes Theorem , Gene Frequency , Genetics, Population , Models, Genetic , Algorithms , Animals , Drosophila melanogaster/genetics , Hypoxia/genetics , Likelihood Functions , Markov Chains , Monte Carlo Method , Polymorphism, Single Nucleotide
6.
Syst Biol ; 67(5): 821-829, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29562307

ABSTRACT

The analysis of hybridization and gene flow among closely related taxa is a common goal for researchers studying speciation and phylogeography. Many methods for hybridization detection use simple site pattern frequencies from observed genomic data and compare them to null models that predict an absence of gene flow. The theory underlying the detection of hybridization using these site pattern probabilities exploits the relationship between the coalescent process for gene trees within population trees and the process of mutation along the branches of the gene trees. For certain models, site patterns are predicted to occur in equal frequency (i.e., their difference is 0), producing a set of functions called phylogenetic invariants. In this article, we introduce HyDe, a software package for detecting hybridization using phylogenetic invariants arising under the coalescent model with hybridization. HyDe is written in Python and can be used interactively or through the command line using pre-packaged scripts. We demonstrate the use of HyDe on simulated data, as well as on two empirical data sets from the literature. We focus in particular on identifying individual hybrids within population samples and on distinguishing between hybrid speciation and gene flow. HyDe is freely available as an open source Python package under the GNU GPL v3 on both GitHub (https://github.com/pblischak/HyDe) and the Python Package Index (PyPI: https://pypi.python.org/pypi/phyde).


Subject(s)
Computational Biology/methods , Gene Flow , Genetic Speciation , Hybridization, Genetic , Software
7.
Bioinformatics ; 34(3): 407-415, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29028881

ABSTRACT

Motivation: Genotyping and parameter estimation using high throughput sequencing data are everyday tasks for population geneticists, but methods developed for diploids are typically not applicable to polyploid taxa. This is due to their duplicated chromosomes, as well as the complex patterns of allelic exchange that often accompany whole genome duplication (WGD) events. For WGDs within a single lineage (autopolyploids), inbreeding can result from mixed mating and/or double reduction. For WGDs that involve hybridization (allopolyploids), alleles are typically inherited through independently segregating subgenomes. Results: We present two new models for estimating genotypes and population genetic parameters from genotype likelihoods for auto- and allopolyploids. We then use simulations to compare these models to existing approaches at varying depths of sequencing coverage and ploidy levels. These simulations show that our models typically have lower levels of estimation error for genotype and parameter estimates, especially when sequencing coverage is low. Finally, we also apply these models to two empirical datasets from the literature. Overall, we show that the use of genotype likelihoods to model non-standard inheritance patterns is a promising approach for conducting population genomic inferences in polyploids. Availability and implementation: A C++ program, EBG, is provided to perform inference using the models we describe. It is available under the GNU GPLv3 on GitHub: https://github.com/pblischak/polyploid-genotyping. Contact: blischak.4@osu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Genotyping Techniques/methods , Inbreeding , Polymorphism, Single Nucleotide , Polyploidy , Sequence Analysis, DNA/methods , Software , Alleles , Animals , Eukaryota/genetics , Genetics, Population/methods , High-Throughput Nucleotide Sequencing/methods
8.
Syst Biol ; 66(4): 620-636, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28123114

ABSTRACT

Detecting variation in the evolutionary process along chromosomes is increasingly important as whole-genome data become more widely available. For example, factors such as incomplete lineage sorting, horizontal gene transfer, and chromosomal inversion are expected to result in changes in the underlying gene trees along a chromosome, while changes in selective pressure and mutational rates for different genomic regions may lead to shifts in the underlying mutational process. We propose the split score as a general method for quantifying support for a particular phylogenetic relationship within a genomic data set. Because the split score is based on algebraic properties of a matrix of site pattern frequencies, it can be rapidly computed, even for data sets that are large in the number of taxa and/or in the length of the alignment, providing an advantage over other methods (e.g., maximum likelihood) that are often used to assess such support. Using simulation, we explore the properties of the split score, including its dependence on sequence length, branch length, size of a split and its ability to detect true splits in the underlying tree. Using a sliding window analysis, we show that split scores can be used to detect changes in the underlying evolutionary process for genome-scale data from primates, mosquitoes, and viruses in a computationally efficient manner. Computation of the split score has been implemented in the software package SplitSup.


Subject(s)
Classification/methods , Phylogeny , Animals , Culicidae/classification , Culicidae/genetics , Evolution, Molecular , Gene Transfer, Horizontal , Genome/genetics , Primates/classification , Primates/genetics , Software , Viruses/classification , Viruses/genetics
9.
Mol Phylogenet Evol ; 106: 144-150, 2017 01.
Article in English | MEDLINE | ID: mdl-27693467

ABSTRACT

Although it is widely appreciated that gene trees may differ from the overall species tree and from one another due to various evolutionary processes (e.g., incomplete lineage sorting (ILS), horizontal gene transfer, etc.), the extent of this incongruence is rarely quantified and discussed. Here we consider the expected amount of incongruence arising from ILS, as modeled by the coalescent process. In particular, we compute the probability that two gene trees randomly sampled from the same species tree agree with one another as well as the distribution of the Robinson-Foulds distance between them, for species trees with three to eight taxa. We demonstrate that, as expected under the coalescent model, the amount of discordance is affected by species tree-specific factors such as speciation times and effective population sizes for the species under consideration. Our results highlight the fact that substantial discordance may occur, even when the number of species is very small, which has implications both for larger taxon samples and for any method that uses estimated gene trees as the basis for further statistical inference. The amount of incongruence is substantial enough that such methods may need to be modified to account for variability in the underlying gene trees.


Subject(s)
Models, Genetic , Genetic Loci , Genetic Speciation , Phylogeny , Recombination, Genetic
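
For the smallest case the abstract above mentions, three taxa, the agreement probability has a simple closed form, sketched here as a worked example. Under the multispecies coalescent with species tree ((A,B),C) and internal branch length `t` in coalescent units, the matching gene tree topology has probability 1 - (2/3)e^(-t) and each of the two discordant topologies has probability (1/3)e^(-t). Two independently sampled gene trees agree when both fall on the same topology. The function names are assumptions for illustration; the four- to eight-taxon cases in the paper require the full gene tree distribution and are not covered.

```python
import math


def gene_tree_probs(t):
    """Rooted gene tree topology probabilities for species tree ((A,B),C)."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    discordant = math.exp(-t) / 3.0
    return {
        "AB|C": 1.0 - 2.0 * discordant,  # matches the species tree
        "AC|B": discordant,
        "BC|A": discordant,
    }


def prob_two_gene_trees_agree(t):
    """Probability that two independent gene trees share a topology."""
    return sum(p * p for p in gene_tree_probs(t).values())
```

At `t = 0` all three topologies are equally likely, so the agreement probability is 1/3; it approaches 1 as the internal branch grows, which shows how strongly discordance depends on speciation times even for very few species.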
10.
Mol Phylogenet Evol ; 105: 177-192, 2016 12.
Article in English | MEDLINE | ID: mdl-27614251

ABSTRACT

We propose a coalescent model for three species that allows gene flow between both pairs of sister populations. The model is designed for multilocus genomic sequence alignments, with one sequence sampled from each of the three species, and is formulated using a Markov chain representation that allows use of matrix exponentiation to compute analytical expressions for the probability density of coalescent histories. The coalescent history distribution as well as the gene tree topology distribution under this coalescent model with gene flow are then calculated via numerical integration. We analyze the model to compare the distributions of gene tree topologies and coalescent histories for species trees with differing effective population sizes and gene flow rates. Our results suggest conditions under which the species tree and associated parameters are not identifiable from the gene tree topology distribution when gene flow is present, but indicate that the coalescent history distribution may identify the species tree and associated parameters. Thus, the coalescent history distribution can be used to infer parameters such as the ancestral effective population sizes and the rates of gene flow in a maximum likelihood (ML) framework. We conduct computer simulations to evaluate the performance of our method in estimating these parameters, and we apply our method to an Afrotropical mosquito data set (Fontaine et al., 2015).


Subject(s)
Gene Flow , Models, Genetic , Computer Simulation , Phylogeny , Probability
11.
Mol Ecol Resour ; 16(3): 742-54, 2016 May.
Article in English | MEDLINE | ID: mdl-26607217

ABSTRACT

Despite the increasing opportunity to collect large-scale data sets for population genomic analyses, the use of high-throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty, ADU), which complicates the calculation of important quantities such as allele frequencies. Here, we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high-throughput sequencing data in the form of read counts. We bridge the gap from data collection (using restriction enzyme based techniques [e.g. GBS, RADseq]) to allele frequency estimation in a unified inferential framework using a hierarchical Bayesian model to sum over genotype uncertainty. Simulated data sets were generated under various conditions for tetraploid, hexaploid and octoploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also provide an implementation of our model in the R package polyfreqs and demonstrate its use with two example analyses that investigate (i) levels of expected and observed heterozygosity and (ii) model adequacy. Our simulations show that the number of individuals sampled from a population has a greater impact on estimation error than sequencing coverage. The example analyses also show that our model and software can be used to make inferences beyond the estimation of allele frequencies for autopolyploids by providing assessments of model adequacy and estimates of heterozygosity.


Subject(s)
Biostatistics/methods , Gene Frequency , Genetics, Population/methods , Genotype , Polyploidy , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
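
The idea of summing over genotype uncertainty in the abstract above can be sketched for a single individual at one biallelic SNP. The sketch assumes a binomial read-count likelihood with a symmetric sequencing error rate and a binomial genotype prior at a fixed allele frequency; these modelling choices, the function name, and the `error` parameter are assumptions for illustration. It is not the polyfreqs implementation, which estimates allele frequencies jointly within a hierarchical model.

```python
from math import comb


def genotype_posterior(alt_reads, total_reads, ploidy, allele_freq, error=0.01):
    """Posterior over allele dosage g = 0..ploidy given read counts."""
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    weights = []
    for g in range(ploidy + 1):
        # Prior: dosage is Binomial(ploidy, allele_freq).
        prior = comb(ploidy, g) * allele_freq**g * (1 - allele_freq) ** (ploidy - g)
        # Chance that one read shows the alternative allele, allowing for error.
        p_alt = (g / ploidy) * (1 - error) + (1 - g / ploidy) * error
        likelihood = (
            comb(total_reads, alt_reads)
            * p_alt**alt_reads
            * (1 - p_alt) ** (total_reads - alt_reads)
        )
        weights.append(prior * likelihood)
    total = sum(weights)
    if total == 0:
        raise ValueError("data have zero probability under these settings")
    return [w / total for w in weights]
```

The expected dosage, `sum(g * p for g, p in enumerate(posterior))`, averages over the uncertain genotype instead of committing to a single call.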
12.
Math Biosci ; 268: 9-21, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26256054

ABSTRACT

One of the fundamental goals in phylogenetics is to make inferences about the evolutionary pattern among a group of individuals, such as genes or species, using present-day genetic material. This pattern is represented by a phylogenetic tree, and as computational methods have caught up to the statistical theory, Bayesian methods of making inferences about phylogenetic trees have become increasingly popular. Bayesian inference of phylogenetic trees requires sampling from intractable probability distributions. Common methods of sampling from these distributions include Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods, and one way that both of these methods can proceed is by first simulating a tree topology and then taking a sample from the posterior distribution of the branch lengths given the tree topology and the data set. In many MCMC methods, it is difficult to verify that the underlying Markov chain is geometrically ergodic, and thus, it is necessary to rely on output-based convergence diagnostics in order to assess convergence on an ad hoc basis. These diagnostics suffer from several important limitations, so in an effort to circumvent these limitations, this work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.


Subject(s)
Bayes Theorem , Models, Theoretical , Phylogeny
13.
Ecol Evol ; 4(4): 462-73, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24634730

ABSTRACT

In surveys of hybrid zones, dominant genetic markers are often used to identify individuals of hybrid origin and assign these individuals to one of several potential hybrid classes. Quantitative analyses that address the statistical power of dominant markers in such inference are scarce. In this study, dominant genotype data were simulated to evaluate the effects of, first, the number of loci analyzed, second, the magnitude of differentiation between the markers scored in the groups that are hybridizing, and third, the level of genotyping error associated with the data when assigning individuals to various parental and hybrid categories. The overall performance of the assignment methods was relatively modest at the lowest level of divergence examined ($F_{ST} \approx 0.4$), but improved substantially at higher levels of differentiation ($F_{ST} \approx 0.67$ or $0.8$). The effect of genotyping error was dependent on the level of divergence between parental taxa, with larger divergences tempering the effects of genotyping error. These results highlight the importance of considering the effects of each of the variables when assigning individuals to various parental and hybrid categories, and can help guide decisions regarding the number of loci employed in future hybridization studies to achieve the power and level of resolution desired.

14.
Mol Phylogenet Evol ; 70: 63-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24055603

ABSTRACT

Multi-locus phylogenetic inference is commonly carried out via models that incorporate the coalescent process to model the possibility that incomplete lineage sorting leads to incongruence between gene trees and the species tree. An interesting question that arises in this context is whether data "fit" the coalescent model. Previous work (Rosenfeld et al., 2012) has suggested that rooting of gene trees may account for variation in empirical data that has been previously attributed to the coalescent process. We examine this possibility using simulated data. We show that, in the case of four taxa, the distribution of gene trees observed from rooting estimated gene trees with either the molecular clock or with outgroup rooting can be closely matched by the distribution predicted by the coalescent model with specific choices of species tree branch lengths. We apply commonly-used coalescent-based methods of species tree inference to assess their performance in these situations.


Subject(s)
Phylogeny , Models, Genetic , Probability , Sequence Analysis, DNA
15.
Mol Phylogenet Evol ; 69(3): 1057-62, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23769751

ABSTRACT

With recent advances in genomic sequencing, the importance of taking the effects of the processes that can cause discord between the speciation history and the individual gene histories into account has become evident. For multilocus datasets, it is difficult to achieve complete coverage of all sampled loci across all sample specimens, a problem that also arises when combining incompletely overlapping datasets. Here we examine how missing data affects the accuracy of species tree reconstruction. In our study, 10- and 100-locus sequence datasets were simulated under the coalescent model from shallow and deep speciation histories, and species trees were estimated using the maximum likelihood and Bayesian frameworks (with STEM and *BEAST, respectively). The accuracy of the estimated species trees was evaluated using the symmetric difference and the SPR distance. We examine the effects of sampling more than one individual per species, as well as the effects of different patterns of missing data (i.e., different amounts of missing data, which is represented among random taxa as opposed to being concentrated in specific taxa, as is often the case for empirical studies). Our general conclusion is that the species tree estimates are remarkably resilient to the effects of missing data. We find that for datasets with more limited numbers of loci, sampling more than one individual per species has the strongest effect on improving species tree accuracy when there is missing data, especially at higher degrees of missing data. For larger multilocus datasets (e.g., 25-100 loci), the amount of missing data has a negligible effect on species tree reconstruction, even at 50% missing data and a single sampled individual per species.


Subject(s)
Genetic Speciation , Models, Genetic , Phylogeny , Sequence Analysis, DNA/methods , Bayes Theorem , Computer Simulation , Likelihood Functions
16.
BMC Bioinformatics ; 14: 200, 2013 Jun 20.
Article in English | MEDLINE | ID: mdl-23786262

ABSTRACT

BACKGROUND: In mammalian genetics, many quantitative traits, such as blood pressure, are thought to be influenced by specific genes, but are also affected by environmental factors, making the associated genes difficult to identify and locate from genetic data alone. In particular, the application of classical statistical methods to single nucleotide polymorphism (SNP) data collected in genome-wide association studies has been especially challenging. We propose a coalescent approach to search for SNPs associated with quantitative traits in genome-wide association study (GWAS) data by taking into account the evolutionary history among SNPs. RESULTS: We evaluate the performance of the new method using simulated data, and find that it performs at least as well as existing methods with an increase in performance in the case of population structure. Application of the methodology to a real data set consisting of high-density lipoprotein cholesterol measurements in mice shows the method performs well for empirical data, as well. CONCLUSIONS: By combining methods from stochastic processes and phylogenetics, this work provides an innovative avenue for the development of new statistical methodology in the analysis of GWAS data.


Subject(s)
Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Animals , Cholesterol, HDL/blood , Mice , Phenotype
19.
Mol Phylogenet Evol ; 59(2): 354-63, 2011 May.
Article in English | MEDLINE | ID: mdl-21397706

ABSTRACT

Development of methods for estimating species trees from multilocus data is a current challenge in evolutionary biology. We propose a method for estimating the species tree topology and branch lengths using approximate Bayesian computation (ABC). The method takes as data a sample of observed rooted gene tree topologies, and then iterates through the following sequence of steps: First, a randomly selected species tree is used to compute the distribution of rooted gene tree topologies. This distribution is then compared to the observed gene topology frequencies, and if the fit between the observed and the predicted distributions is close enough, the proposed species tree is retained. Repeating this many times leads to a collection of retained species trees that are then used to form the estimate of the overall species tree. We test the performance of the method, which we call ST-ABC, using both simulated and empirical data. The simulation study examines both symmetric and asymmetric species trees over a range of branch lengths and sample sizes. The results from the simulation study show that the model performs very well, giving accurate estimates for both the topology and the branch lengths across the conditions studied, and that a sample size of 25 loci appears to be adequate for the method. Further, we apply the method to two empirical cases: a 4-taxon data set for primates and a 7-taxon data set for yeast. In both cases, we find that estimates obtained with ST-ABC agree with previous studies. The method provides efficient estimation of the species tree, and does not require sequence data, but rather the observed distribution of rooted gene topologies without branch lengths. Therefore, this method is a useful alternative to other currently available methods for species tree estimation.


Subject(s)
Algorithms , Bayes Theorem , Classification/methods , Models, Genetic , Phylogeny , Animals , Computer Simulation , Humans , Primates/genetics , Yeasts
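
The propose, compare, and retain loop described in the abstract above can be sketched as a rejection sampler for the three-taxon case. This is an illustration, not ST-ABC itself: the uniform priors, the L1 distance, the tolerance `epsilon`, and all function names are assumptions. The predicted gene tree frequencies use the standard three-taxon coalescent result, where each discordant topology has probability (1/3)e^(-t) for internal branch length `t` in coalescent units.

```python
import math
import random

TOPOLOGIES = ("AB|C", "AC|B", "BC|A")


def predicted_frequencies(species_topology, t):
    """Gene tree topology distribution for a 3-taxon species tree."""
    minor = math.exp(-t) / 3.0
    return {
        g: (1.0 - 2.0 * minor if g == species_topology else minor)
        for g in TOPOLOGIES
    }


def abc_rejection(observed, n_proposals=20000, epsilon=0.05, max_t=5.0, seed=1):
    """Retain proposed (topology, t) pairs whose prediction fits the data."""
    rng = random.Random(seed)
    retained = []
    for _ in range(n_proposals):
        topology = rng.choice(TOPOLOGIES)  # assumed uniform prior on topology
        t = rng.uniform(0.0, max_t)        # assumed uniform prior on length
        predicted = predicted_frequencies(topology, t)
        distance = sum(abs(observed[g] - predicted[g]) for g in TOPOLOGIES)
        if distance < epsilon:
            retained.append((topology, t))
    return retained


def summarize(retained):
    """Most frequently retained topology and its mean branch length."""
    if not retained:
        return None, float("nan")
    counts = {g: 0 for g in TOPOLOGIES}
    for topology, _ in retained:
        counts[topology] += 1
    best = max(counts, key=counts.get)
    lengths = [t for topology, t in retained if topology == best]
    return best, sum(lengths) / len(lengths)
```

Called with observed frequencies such as `{"AB|C": 0.8, "AC|B": 0.1, "BC|A": 0.1}`, the sampler needs only the topology frequencies, which mirrors the abstract's point that sequence data are not required.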
20.
Syst Biol ; 60(4): 393-409, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21389297

ABSTRACT

Phylogenetic relationships and taxonomic distinctiveness of closely related species and subspecies are most accurately inferred from data derived from multiple independent loci. Here, we apply several approaches for understanding species-level relationships using data from 18 nuclear DNA loci and 1 mitochondrial DNA locus within currently described species and subspecies of Sistrurus rattlesnakes. Collectively, these methods provide evidence that a currently described species, the massasauga rattlesnake (Sistrurus catenatus), consists of two well-supported clades, one composed of the two western subspecies (S. c. tergeminus and S. c. edwardsii) and the other the eastern subspecies (S. c. catenatus). Within pigmy rattlesnakes (S. miliarius), however, there is not strong support across methods for any particular grouping at the subspecific level. Monophyly based tests for taxonomic distinctiveness show evidence for distinctiveness of all subspecies but this support is strongest by far for the S. c. catenatus clade. Because support for the distinctiveness of S. c. catenatus is both strong and consistent across methods, and due to its morphological distinctiveness and allopatric distribution, we suggest that this subspecies be elevated to full species status, which has significant conservation implications. Finally, most divergence time estimates based upon a fossil-calibrated species tree are > 50% younger than those from a concatenated gene tree analysis and suggest that an active period of speciation within Sistrurus occurred within the late Pliocene/Pleistocene eras.


Subject(s)
Crotalus/classification , Phylogeny , Animals , Crotalus/genetics , DNA/chemistry , DNA, Mitochondrial/chemistry , Genetic Speciation , Recombination, Genetic , Sequence Analysis, DNA , Species Specificity