Search | BVS Regional Portal

Molecular dating for phylogenies containing a mix of populations and species by using Bayesian and RelTime approaches.

Mello, Beatriz; Tao, Qiqing; Barba-Montoya, Jose; Kumar, Sudhir.

Mol Ecol Resour ; 21(1): 122-136, 2021 Jan.

Article in English | MEDLINE | ID: mdl-32881388

ABSTRACT

Simultaneous molecular dating of population and species divergences is essential in many biological investigations, including phylogeography, phylodynamics and species delimitation studies. In these investigations, multiple sequence alignments consist of both intra- and interspecies samples (mixed samples). As a result, the phylogenetic trees contain interspecies, interpopulation and within-population divergences. Bayesian relaxed clock methods are often employed in these analyses, but they assume the same tree prior for both inter- and intraspecies branching processes and require specification of a clock model for branch rates (independent vs. autocorrelated rates models). We evaluated the impact of a single tree prior on Bayesian divergence time estimates by analysing computer-simulated data sets. We also examined the effect of the assumption of independence of evolutionary rate variation among branches when the branch rates are autocorrelated. Bayesian approach with coalescent tree priors generally produced excellent molecular dates and highest posterior densities with high coverage probabilities. We also evaluated the performance of a non-Bayesian method, RelTime, which does not require the specification of a tree prior or a clock model. RelTime's performance was similar to that of the Bayesian approach, suggesting that it is also suitable to analyse data sets containing both populations and species variation when its computational efficiency is needed.

Subjects

Molecular Evolution, Mammals, Genetic Models, Phylogeny, Animals, Bayes Theorem, Computer Simulation, Reproducibility of Results

An evolutionary portrait of the progenitor SARS-CoV-2 and its dominant offshoots in COVID-19 pandemic.

Kumar, Sudhir; Tao, Qiqing; Weaver, Steven; Sanderford, Maxwell; Caraballo-Ortiz, Marcos A; Sharma, Sudip; Pond, Sergei L K; Miura, Sayaka.

bioRxiv ; 2021 Jan 19.

Article in English | MEDLINE | ID: mdl-32995781

ABSTRACT

We report the likely most recent common ancestor of SARS-CoV-2 - the coronavirus that causes COVID-19. This progenitor SARS-CoV-2 genome was recovered through a novel application and advancement of computational methods initially developed to reconstruct the mutational history of tumor cells in a patient. The progenitor differs from the earliest coronaviruses sampled in China by three variants, implying that none of the earliest patients represent the index case or gave rise to all the human infections. However, multiple coronavirus infections in China and the USA harbored the progenitor genetic fingerprint in January 2020 and later, suggesting that the progenitor was spreading worldwide as soon as weeks after the first reported cases of COVID-19. Mutations of the progenitor and its offshoots have produced many dominant coronavirus strains, which have spread episodically over time. Fingerprinting based on common mutations reveals that the same coronavirus lineage has dominated North America for most of the pandemic. There have been multiple replacements of predominant coronavirus strains in Europe and Asia and the continued presence of multiple high-frequency strains in Asia and North America. We provide a continually updating dashboard of global evolution and spatiotemporal trends of SARS-CoV-2 spread (http://sars2evo.datamonkey.org/).

Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics.

Tao, Qiqing; Barba-Montoya, Jose; Huuki, Louise A; Durnan, Mary Kathleen; Kumar, Sudhir.

Mol Biol Evol ; 37(6): 1819-1831, 2020 Jun 01.

Article in English | MEDLINE | ID: mdl-32119075

ABSTRACT

The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is yet to be quantified for contemporary data sets that frequently contain sequences from many species and genes. In a reanalysis of many large multispecies alignments from diverse groups of taxa, we found that the use of the simplest models can produce divergence time estimates and credibility intervals similar to those obtained from the complex models applied in the original studies. This result is surprising because the use of simple models underestimates sequence divergence for all the data sets analyzed. We found three fundamental reasons for the observed robustness of time estimates to model complexity in many practical data sets. First, the estimates of branch lengths and node-to-tip distances under the simplest model show an approximately linear relationship with those produced by using the most complex models applied on data sets with many sequences. Second, relaxed clock methods automatically adjust rates on branches that experience considerable underestimation of sequence divergences, resulting in time estimates that are similar to those from complex models. And, third, the inclusion of even a few good calibrations in an analysis can reduce the difference in time estimates from simple and complex models. The robustness of time estimates to model complexity in these empirical data analyses is encouraging, because all phylogenomics studies use statistical models that are oversimplified descriptions of actual evolutionary substitution processes.

Subjects

Molecular Evolution, Genomics/methods, Genetic Models, Phylogeny, Plants/genetics

Reliable Confidence Intervals for RelTime Estimates of Evolutionary Divergence Times.

Tao, Qiqing; Tamura, Koichiro; Mello, Beatriz; Kumar, Sudhir.

Mol Biol Evol ; 37(1): 280-290, 2020 Jan 01.

Article in English | MEDLINE | ID: mdl-31638157

ABSTRACT

Confidence intervals (CIs) depict the statistical uncertainty surrounding evolutionary divergence time estimates. They capture variance contributed by the finite number of sequences and sites used in the alignment, deviations of evolutionary rates from a strict molecular clock in a phylogeny, and uncertainty associated with clock calibrations. Reliable tests of biological hypotheses demand reliable CIs. However, current non-Bayesian methods may produce unreliable CIs because they do not incorporate rate variation among lineages and interactions among clock calibrations properly. Here, we present a new analytical method to calculate CIs of divergence times estimated using the RelTime method, along with an approach to utilize multiple calibration uncertainty densities in dating analyses. Empirical data analyses showed that the new methods produce CIs that overlap with Bayesian highest posterior density intervals. In the analysis of computer-simulated data, we found that RelTime CIs show excellent average coverage probabilities, that is, the actual time is contained within the CIs with a 94% probability. These developments will encourage broader use of computationally efficient RelTime approaches in molecular dating analyses and biological hypothesis testing.

Subjects

Molecular Evolution, Genetic Techniques, Animals, Confidence Intervals, Humans
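
The coverage probability mentioned in the abstract above is the fraction of simulated nodes whose true divergence time falls inside the estimated interval. The following is a minimal sketch of that calculation only, not the authors' analytical CI method; the function name `coverage_probability` and the input layout (a list of true times and a parallel list of `(lower, upper)` bounds) are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def coverage_probability(
    true_times: Sequence[float],
    intervals: Sequence[Tuple[float, float]],
) -> float:
    """Return the fraction of true times that lie within their intervals.

    Hypothetical helper: `true_times[i]` is the simulated (known) node age and
    `intervals[i]` is the (lower, upper) bound estimated for the same node.
    """
    if len(true_times) != len(intervals):
        raise ValueError("true_times and intervals must have the same length")
    if not true_times:
        raise ValueError("at least one node is required")

    # Count nodes whose true age is inside the closed interval [lower, upper].
    hits = sum(
        1 for t, (lower, upper) in zip(true_times, intervals) if lower <= t <= upper
    )
    return hits / len(true_times)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    ages = [10.0, 20.0, 30.0, 40.0]
    bounds = [(8.0, 12.0), (21.0, 25.0), (25.0, 35.0), (39.0, 41.0)]
    print(coverage_probability(ages, bounds))
```

In a simulation study this quantity would typically be averaged over many nodes and replicate data sets, and a nominal 95% interval is considered well calibrated when the result is close to 0.95.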

A Machine Learning Method for Detecting Autocorrelation of Evolutionary Rates in Large Phylogenies.

Tao, Qiqing; Tamura, Koichiro; Battistuzzi, Fabia U; Kumar, Sudhir.

Mol Biol Evol ; 36(4): 811-824, 2019 Apr 01.

Article in English | MEDLINE | ID: mdl-30689923

ABSTRACT

New species arise from pre-existing species and inherit similar genomes and environments. This predicts greater similarity of the tempo of molecular evolution between direct ancestors and descendants, resulting in autocorrelation of evolutionary rates in the tree of life. Surprisingly, molecular sequence data have not confirmed this expectation, possibly because available methods lack the power to detect autocorrelated rates. Here, we present a machine learning method, CorrTest, to detect the presence of rate autocorrelation in large phylogenies. CorrTest is computationally efficient and performs better than the available state-of-the-art method. Application of CorrTest reveals extensive rate autocorrelation in DNA and amino acid sequence evolution of mammals, birds, insects, metazoans, plants, fungi, parasitic protozoans, and prokaryotes. Therefore, rate autocorrelation is a common phenomenon throughout the tree of life. These findings suggest concordance between molecular and nonmolecular evolutionary patterns, and they will foster unbiased and precise dating of the tree of life.

Subjects

Biological Evolution, Genetic Techniques, Genetic Models, Machine Learning, Time Factors
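
To make the distinction between autocorrelated and independent rates concrete, the sketch below simulates log rates along a single ancestor-to-descendant lineage under both assumptions and compares the ancestor-descendant correlation. This is a toy illustration and not CorrTest, which is a machine learning method applied to whole phylogenies; the function names, the single-lineage simplification, and the parameter `sigma` are all assumptions.

```python
import math
import random
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_log_rates(
    n_branches: int, sigma: float, autocorrelated: bool, rng: random.Random
) -> List[float]:
    """Simulate log rates for successive branches along one lineage.

    Autocorrelated: each branch's log rate is its ancestor's log rate plus
    Gaussian noise. Independent: each log rate is drawn around a fixed mean of 0.
    """
    log_rates = [rng.gauss(0.0, sigma)]
    for _ in range(n_branches - 1):
        center = log_rates[-1] if autocorrelated else 0.0
        log_rates.append(center + rng.gauss(0.0, sigma))
    return log_rates


def ancestor_descendant_correlation(log_rates: Sequence[float]) -> float:
    """Correlation between each branch's log rate and its direct descendant's."""
    return pearson(log_rates[:-1], log_rates[1:])


if __name__ == "__main__":
    rng = random.Random(1)
    auto = simulate_log_rates(2000, 0.1, autocorrelated=True, rng=rng)
    indep = simulate_log_rates(2000, 0.1, autocorrelated=False, rng=rng)
    print("autocorrelated:", ancestor_descendant_correlation(auto))
    print("independent:   ", ancestor_descendant_correlation(indep))
```

Under these assumptions the autocorrelated lineage should show a correlation close to 1 and the independent lineage a correlation close to 0. Real phylogenies have far fewer ancestor-descendant pairs and rates must themselves be estimated, which is part of why detecting autocorrelation is difficult.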

Fast and Accurate Estimates of Divergence Times from Big Data.

Mello, Beatriz; Tao, Qiqing; Tamura, Koichiro; Kumar, Sudhir.

Mol Biol Evol ; 34(1): 45-50, 2017 Jan.

Article in English | MEDLINE | ID: mdl-27836983

ABSTRACT

Ongoing advances in sequencing technology have led to an explosive expansion in the molecular data available for building increasingly larger and more comprehensive timetrees. However, Bayesian relaxed-clock approaches frequently used to infer these timetrees impose a large computational burden and discourage critical assessment of the robustness of inferred times to model assumptions, influence of calibrations, and selection of optimal data subsets. We analyzed eight large, recently published, empirical datasets to compare time estimates produced by RelTime (a non-Bayesian method) with those reported by using Bayesian approaches. We find that RelTime estimates are very similar to Bayesian approaches, yet RelTime requires orders of magnitude less computational time. This means that the use of RelTime will enable greater rigor in molecular dating, because faster computational speeds encourage more extensive testing of the robustness of inferred timetrees to prior assumptions (models and calibrations) and data subsets. Thus, RelTime provides a reliable and computationally thrifty approach for dating the tree of life using large-scale molecular datasets.

Subjects

Biological Evolution, Computational Biology/methods, Nucleic Acid Databases, Genetic Variation, Animals, Bayes Theorem, Birds/genetics, Computer Simulation, Datasets as Topic, Molecular Evolution, Genetic Speciation, Mammals/genetics, Genetic Models, Mutation Rate, Phylogeny, Spiders/genetics
