1.
Trends Microbiol ; 25(1): 11-18, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27773523

ABSTRACT

Finding a signature of purifying selection in a gene is usually interpreted as evidence for the gene providing a function that is targeted by natural selection. This opinion offers a very different hypothesis: purifying selection may be due to removing harmful mutations from the population, that is, the gene and its encoded protein become harmful after a mutation occurred, possibly because the mutated protein interferes with the translation machinery, or because of toxicity of the misfolded protein. Finding a signature of purifying selection should not automatically be considered proof of the gene's selectable function.


Subjects
Conserved Sequence/genetics , Escherichia coli/genetics , Molecular Evolution , Genetic Variation/genetics , Genetic Models , Salmonella enterica/genetics , Base Sequence , Mutation/genetics , Phylogeny , Genetic Selection/genetics
2.
Mol Phylogenet Evol ; 107: 338-344, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27894995

ABSTRACT

Long Branch Attraction (LBA) is a well-known artifact in phylogenetic reconstruction when dealing with branch length heterogeneity. Here we show another phenomenon, Short Branch Attraction (SBA), which occurs when BLAST searches, a phenetic analysis, are used as a surrogate method for phylogenetic analysis. This error also results from branch length heterogeneity, but this time it is the short branches that are attracting. The SBA artifact is reciprocal and can be returned 100% of the time when multiple branches differ in length by a factor of more than two. SBA is an intended feature of BLAST searches, but becomes an issue when top-scoring BLAST hit analyses are used to infer Horizontal Gene Transfers (HGTs), assign taxonomic category with environmental sequence data in phylotyping, or gather homologous sequences for building gene families. SBA can lead researchers to believe that there has been an HGT event when only vertical descent has occurred, cause slowly evolving taxa to be over-represented and quickly evolving taxa to be under-represented in phylotyping, or systematically exclude quickly evolving taxa from analyses. SBA also contributes to the changing results of top-scoring BLAST hit analyses as the database grows, because more slowly evolving taxa, or short branches, are added over time, introducing more potential for SBA. SBA can be detected by examining reciprocal best BLAST hits among a larger group of taxa, including the known closest phylogenetic neighbors. Therefore, one should look for this phenomenon when conducting best BLAST hit analyses as a surrogate method to identify HGTs, in phylotyping, or when using BLAST to gather homologous sequences.


Subjects
Artifacts , Phylogeny , Sequence Alignment/methods , Time Factors
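
The reciprocal best-hit check mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the toy score table, and the `expected_neighbor` mapping are all assumptions made for illustration, and the scores are invented rather than taken from any BLAST run.

```python
def symmetric_scores(pairs):
    """Expand {(a, b): score} so that both (a, b) and (b, a) are present."""
    scores = {}
    for (a, b), score in pairs.items():
        scores[(a, b)] = score
        scores[(b, a)] = score
    return scores


def best_hits(scores):
    """Map each query taxon to its top-scoring subject, ignoring self-hits."""
    best = {}
    for (query, subject), score in scores.items():
        if query == subject:
            continue
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(scores):
    """Return the unordered pairs that are each other's best hit."""
    best = best_hits(scores)
    return {frozenset((q, s)) for q, s in best.items() if best.get(s) == q}


def unexpected_best_hits(scores, expected_neighbor):
    """Return taxa whose best hit is not their known closest neighbor.

    Taxa missing from expected_neighbor are also reported.
    """
    best = best_hits(scores)
    return {t: hit for t, hit in best.items() if expected_neighbor.get(t) != hit}


# Hypothetical scores for a true tree ((A,B),(C,D)) in which A and C are
# slowly evolving (short branches) and B and D are quickly evolving.
toy_scores = symmetric_scores({
    ("A", "C"): 300, ("A", "B"): 250, ("C", "D"): 250,
    ("A", "D"): 200, ("B", "C"): 200, ("B", "D"): 150,
})
known_sisters = {"A": "B", "B": "A", "C": "D", "D": "C"}

rbh = reciprocal_best_hits(toy_scores)
flagged = unexpected_best_hits(toy_scores, known_sisters)
```

In this toy table the two short-branch taxa, A and C, are reciprocal best hits even though they are not sisters, and both are flagged because their best hit differs from the known neighbor. A pattern like this is a reason to build a tree before inferring a transfer.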
3.
Bioinformatics ; 29(5): 571-9, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23335015

ABSTRACT

MOTIVATION: Horizontal gene transfer (HGT) plays a crucial role in the evolution of prokaryotic species. Typically, no more than a few genes are horizontally transferred between any two species. However, several studies identified pairs of species (or lineages) between which many different genes were horizontally transferred. Such a pair is said to be linked by a highway of gene sharing. Inferring such highways is crucial to understanding the evolution of prokaryotes and for inferring past symbiotic and ecological associations among different species. RESULTS: We present a new improved method for systematically detecting highways of gene sharing. As we demonstrate using a variety of simulated datasets, our method is highly accurate and efficient, and robust to noise and high rates of HGT. We further validate our method by applying it to a published dataset of >22 000 gene trees from 144 prokaryotic species. Our method makes it practical, for the first time, to perform accurate highway analysis quickly and easily even on large datasets with high rates of HGT. AVAILABILITY AND IMPLEMENTATION: An implementation of the method can be freely downloaded from: http://acgt.cs.tau.ac.il/hide.


Subjects
Algorithms , Horizontal Gene Transfer , Bacterial Genes , Phylogeny , Bacteria/classification , Bacteria/genetics , Molecular Evolution
4.
BMC Evol Biol ; 12: 85, 2012 Jun 13.
Article in English | MEDLINE | ID: mdl-22694720

ABSTRACT

BACKGROUND: Horizontal gene transfer (HGT) has greatly impacted the genealogical history of many lineages, particularly for prokaryotes, with genes frequently moving in and out of a line of descent. Many genes that were acquired by a lineage in the past likely originated from ancestral relatives that have since gone extinct. During the course of evolution, HGT has played an essential role in the origin and dissemination of genetic and metabolic novelty. RESULTS: Three divergent forms of leucyl-tRNA synthetase (LeuRS) exist in the archaeal order Halobacteriales, commonly known as haloarchaea. Few haloarchaeal genomes have the typical archaeal form of this enzyme and phylogenetic analysis indicates it clusters within the Euryarchaeota as expected. The majority of sequenced halobacterial genomes possess a bacterial form of LeuRS. Phylogenetic reconstruction puts this larger group of haloarchaea at the base of the bacterial domain. The most parsimonious explanation is that an ancient transfer of LeuRS took place from an organism related to the ancestor of the bacterial domain to the haloarchaea. The bacterial form of LeuRS further underwent gene duplications and/or gene transfers within the haloarchaea, with some genomes possessing two distinct types of bacterial LeuRS. The cognate tRNALeu also reveals two distinct clusters for the haloarchaea; however, these tRNALeu clusters do not coincide with the groupings found in the LeuRS tree, revealing that LeuRS evolved independently of its cognate tRNA. CONCLUSIONS: The study of leucyl-tRNA synthetase in haloarchaea illustrates the importance of gene transfer originating in lineages that went extinct since the transfer occurred. The haloarchaeal LeuRS and tRNALeu did not co-evolve.


Subjects
Molecular Evolution , Horizontal Gene Transfer , Halobacteriales/classification , Leucine-tRNA Ligase/genetics , Phylogeny , Archaeal Proteins/genetics , Archaeal DNA/genetics , Archaeal Genome , Halobacteriales/enzymology , Halobacteriales/genetics , Likelihood Functions , Multilocus Sequence Typing , Leucine Transfer RNA/genetics
5.
Syst Biol ; 55(4): 553-65, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16857650

ABSTRACT

Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, delta and epsilon, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a "metachain" to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely.


Subjects
Classification/methods , Statistical Data Interpretation , Markov Chains , Genetic Models , Monte Carlo Method , Phylogeny , Bayes Theorem , Computer Simulation
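
The "metachain" idea in the abstract above, pooling several short replicated runs to estimate bipartition posterior probabilities, can be sketched as follows. This is an illustrative assumption about the bookkeeping only: each sampled tree is represented as a set of bipartitions (frozensets of taxon names), the `burn_in` parameter and function names are hypothetical, and nothing here reproduces the paper's delta and epsilon statistics or any MrBayes interface.

```python
from collections import Counter


def metachain(runs, burn_in=0):
    """Concatenate the post-burn-in samples of several independent runs."""
    pooled = []
    for run in runs:
        pooled.extend(run[burn_in:])
    return pooled


def bipartition_frequencies(trees):
    """Estimate each bipartition's posterior probability as the fraction
    of sampled trees that contain it."""
    counts = Counter()
    for splits in trees:
        counts.update(set(splits))  # count each split once per tree
    total = len(trees)
    return {split: count / total for split, count in counts.items()}


# Hypothetical samples: each tree is reduced to its set of bipartitions.
AB = frozenset("AB")
AC = frozenset("AC")
run_1 = [{AC}, {AB}, {AB}]
run_2 = [{AB}, {AB}, {AC}]

pooled = metachain([run_1, run_2], burn_in=1)
posterior = bipartition_frequencies(pooled)
```

With one sample discarded from each run as burn-in, four trees remain; the split AB appears in three of them and AC in one, giving estimates of 0.75 and 0.25.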
6.
Trends Microbiol ; 14(1): 4-8, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16356716

ABSTRACT

Non-tree-based ("surrogate") methods have been used to identify instances of lateral genetic transfer in microbial genomes but agreement among predictions of different methods can be poor. It has been proposed that this disagreement arises because different surrogate methods are biased towards the detection of certain types of transfer events. This conjecture is supported by a rigorous phylogenetic analysis of 3776 proteins in Escherichia coli K12 MG1655 to map the ages of transfer events relative to one another.


Subjects
Escherichia coli K12/genetics , Horizontal Gene Transfer/genetics , Bacterial Genome/genetics , Base Composition , Computational Biology , Escherichia coli Proteins/genetics , Markov Chains , Phylogeny
7.
Proc Natl Acad Sci U S A ; 102(40): 14332-7, 2005 Oct 04.
Article in English | MEDLINE | ID: mdl-16176988

ABSTRACT

The extent to which lateral genetic transfer has shaped microbial genomes has major implications for the emergence of community structures. We have performed a rigorous phylogenetic analysis of >220,000 proteins from genomes of 144 prokaryotes to determine the contribution of gene sharing to current prokaryotic diversity, and to identify "highways" of sharing between lineages. The inferred relationships suggest a pattern of inheritance that is largely vertical, but with notable exceptions among closely related taxa, and among distantly related organisms that live in similar environments.


Subjects
Molecular Evolution , Horizontal Gene Transfer/genetics , Genetic Variation , Archaeal Genome/genetics , Bacterial Genome/genetics , Phylogeny , Prokaryotic Cells , Bacterial Proteins/genetics , Base Sequence , Bayes Theorem , Computational Biology/methods , Genetic Models , Sequence Alignment
8.
BMC Evol Biol ; 5: 8, 2005 Jan 28.
Article in English | MEDLINE | ID: mdl-15676079

ABSTRACT

BACKGROUND: Bayesian phylogenetic inference holds promise as an alternative to maximum likelihood, particularly for large molecular-sequence data sets. We have investigated the performance of Bayesian inference with empirical and simulated protein-sequence data under conditions of relative branch-length differences and model violation. RESULTS: With empirical protein-sequence data, Bayesian posterior probabilities provide more-generous estimates of subtree reliability than does the nonparametric bootstrap combined with maximum likelihood inference, reaching 100% posterior probability at bootstrap proportions around 80%. With simulated 7-taxon protein-sequence datasets, Bayesian posterior probabilities are somewhat more generous than bootstrap proportions, but do not saturate. Compared with likelihood, Bayesian phylogenetic inference can be as or more robust to relative branch-length differences for datasets of this size, particularly when among-sites rate variation is modeled using a gamma distribution. When the (known) correct model was used to infer trees, Bayesian inference recovered the (known) correct tree in 100% of instances in which one or two branches were up to 20-fold longer than the others. At ratios more extreme than 20-fold, topological accuracy of reconstruction degraded only slowly when only one branch was of relatively greater length, but more rapidly when there were two such branches. Under an incorrect model of sequence change, inaccurate trees were sometimes observed at less extreme branch-length ratios, and (particularly for trees with single long branches) such trees tended to be more inaccurate. The effect of model violation on accuracy of reconstruction for trees with two long branches was more variable, but gamma-corrected Bayesian inference nonetheless yielded more-accurate trees than did either maximum likelihood or uncorrected Bayesian inference across the range of conditions we examined. Assuming an exponential Bayesian prior on branch lengths did not improve, and under certain extreme conditions significantly diminished, performance. The two topology-comparison metrics we employed, edit distance and Robinson-Foulds symmetric distance, yielded different but highly complementary measures of performance. CONCLUSIONS: Our results demonstrate that Bayesian inference can be relatively robust against biologically reasonable levels of relative branch-length differences and model violation, and thus may provide a promising alternative to maximum likelihood for inference of phylogenetic trees from protein-sequence data.


Subjects
Bayes Theorem , Computational Biology/methods , Molecular Evolution , Likelihood Functions , Phylogeny , Computer Simulation , Statistical Data Interpretation , Protein Databases , Genetic Models , Statistical Models , Theoretical Models , Probability , Proteomics/methods , Reproducibility of Results , Sensitivity and Specificity
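
One of the two topology-comparison metrics named in the abstract above, the Robinson-Foulds symmetric distance, counts the bipartitions present in one tree but not the other. The sketch below is a generic illustration rather than the authors' code: trees are assumed to be nested tuples of taxon-name strings, and the function names are hypothetical.

```python
def leaves(tree):
    """Return the set of taxon names under a node (a string is a leaf)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions of a tree.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor taxon, so the same split always has the same representation.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if anchor in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Symmetric-difference distance between two trees on the same taxa."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same taxon set")
    return len(splits(tree_a) ^ splits(tree_b))


# Two hypothetical 5-taxon trees that differ in the placement of B and C.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
distance = robinson_foulds(tree_1, tree_2)
```

The two example trees share the split separating D and E from the rest and differ in one split each, so the distance is 2; comparing a tree with itself gives 0.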
9.
BMC Bioinformatics ; 5: 45, 2004 Apr 29.
Article in English | MEDLINE | ID: mdl-15115543

ABSTRACT

BACKGROUND: Grouping proteins into sequence-based clusters is a fundamental step in many bioinformatic analyses (e.g., homology-based prediction of structure or function). Standard clustering methods such as single-linkage clustering capture a history of cluster topologies as a function of threshold, but in practice their usefulness is limited because unrelated sequences join clusters before biologically meaningful families are fully constituted, e.g. as the result of matches to so-called promiscuous domains. Use of the Markov Cluster algorithm avoids this non-specificity, but does not preserve topological or threshold information about protein families. RESULTS: We describe a hybrid approach to sequence-based clustering of proteins that combines the advantages of standard and Markov clustering. We have implemented this hybrid approach over a relational database environment, and describe its application to clustering a large subset of PDB, and to 328577 proteins from 114 fully sequenced microbial genomes. To demonstrate utility with difficult problems, we show that hybrid clustering allows us to constitute the paralogous family of ATP synthase F1 rotary motor subunits into a single, biologically interpretable hierarchical grouping that was not accessible using either single-linkage or Markov clustering alone. We describe validation of this method by hybrid clustering of PDB and mapping SCOP families and domains onto the resulting clusters. CONCLUSION: Hybrid (Markov followed by single-linkage) clustering combines the advantages of the Markov Cluster algorithm (avoidance of non-specific clusters resulting from matches to promiscuous domains) and single-linkage clustering (preservation of topological information as a function of threshold). Within the individual Markov clusters, single-linkage clustering is a more-precise instrument, discerning sub-clusters of biological relevance. Our hybrid approach thus provides a computationally efficient approach to the automated recognition of protein families for phylogenomic analysis.


Subjects
Bacterial Proteins/classification , Computational Biology/statistics & numerical data , Bacterial Genome , Algorithms , Amino Acid Sequence , Bacterial Proteins/chemistry , Cluster Analysis , Protein Databases , Mitochondrial Proton-Translocating ATPases/chemistry , Mitochondrial Proton-Translocating ATPases/classification , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data
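
The second stage of the hybrid scheme described in the abstract above, single-linkage clustering applied inside each Markov cluster across a range of thresholds, can be sketched as below. This is a simplified illustration, not the authors' relational-database implementation: the Markov clusters are assumed to be already computed and are passed in as plain sets, and the protein names, similarity scores, and function names are hypothetical.

```python
def single_linkage(members, similarity, threshold):
    """Group members connected by any chain of pairs scoring >= threshold."""
    parent = {m: m for m in members}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        # Pairs that reach outside this Markov cluster are ignored.
        if a in parent and b in parent and score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for m in members:
        groups.setdefault(find(m), set()).add(m)
    return {frozenset(group) for group in groups.values()}


def hybrid_subclusters(markov_clusters, similarity, thresholds):
    """Run single-linkage within each Markov cluster at every threshold."""
    return {
        t: [single_linkage(cluster, similarity, t) for cluster in markov_clusters]
        for t in thresholds
    }


# Hypothetical input: two precomputed Markov clusters and pairwise scores.
markov_clusters = [{"p1", "p2", "p3"}, {"p4", "p5"}]
similarity = {
    ("p1", "p2"): 90, ("p2", "p3"): 40,
    ("p4", "p5"): 70, ("p3", "p4"): 60,  # p3-p4 crosses a cluster boundary
}
result = hybrid_subclusters(markov_clusters, similarity, thresholds=[30, 50])
```

At threshold 50 the first Markov cluster splits into {p1, p2} and {p3}, while at threshold 30 it stays whole. The p3-p4 match never merges the two Markov clusters, which is the point of running the Markov step first.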