1.
Mol Biol Evol ; 25(2): 254-64, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18025066

ABSTRACT

Understanding the dynamics behind domain architecture evolution is of great importance to unravel the functions of proteins. Complex architectures have been created throughout evolution by rearrangement and duplication events. An interesting question is how many times a particular architecture has been created, a form of convergent evolution or domain architecture reinvention. Previous studies have approached this issue by comparing architectures found in different species. We wanted to achieve a finer-grained analysis by reconstructing protein architectures on complete domain trees. The prevalence of domain architecture reinvention in 96 genomes was investigated with a novel domain tree-based method that uses maximum parsimony for inferring ancestral protein architectures. Domain architectures were taken from Pfam. To ensure robustness, we applied the method to bootstrap trees and only considered results with strong statistical support. We detected multiple origins for 12.4% of the scored architectures. In a much smaller data set, the subset of completely domain-assigned proteins, the figure was 5.6%. These results indicate that domain architecture reinvention is a much more common phenomenon than previously thought. We also determined which domains are most frequent in multiply created architectures and assessed whether specific functions could be attributed to them. However, no strong functional bias was found in architectures with multiple origins.


Subjects
Algorithms, Molecular Evolution, Protein Tertiary Structure, Computational Biology, Software
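
The abstract above describes inferring ancestral protein architectures on domain trees by maximum parsimony and counting how often an architecture arose independently. As a rough illustration only, the sketch below applies Fitch parsimony to a single presence/absence character on a small rooted binary tree and counts inferred gains. The tree encoding (nested tuples), the function name `count_origins`, the treatment of the state above the root as "absent", and the tie-breaking rule (prefer "absent" when both states are equally parsimonious) are all assumptions made for this example; the published method works on bootstrap domain trees and may differ in every one of these choices.

```python
from typing import Dict, Union

# A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).
Tree = Union[str, tuple]


def count_origins(tree: Tree, has_arch: Dict[str, bool]) -> int:
    """Count inferred gains of an architecture using Fitch parsimony.

    has_arch maps each leaf name to True if the protein at that leaf
    carries the architecture of interest, False otherwise.
    """

    def up(node):
        # Bottom-up pass: compute the Fitch state set for every node.
        if isinstance(node, str):
            return (frozenset([has_arch[node]]), None, None)
        left, right = up(node[0]), up(node[1])
        common = left[0] & right[0]
        return (common or (left[0] | right[0]), left, right)

    def down(annotated, parent_state):
        # Top-down pass: fix one state per node and count absent -> present.
        states, left, right = annotated
        # Assumption: keep the parent's state when allowed, else prefer absent.
        state = parent_state if parent_state in states else min(states)
        gains = 1 if (state and not parent_state) else 0
        if left is not None:
            gains += down(left, state) + down(right, state)
        return gains

    # Assumption: the architecture is absent above the root, so a root that
    # is inferred to carry it counts as one origin.
    return down(up(tree), False)


if __name__ == "__main__":
    tree = ((("A", "B"), "C"), (("D", "E"), "F"))
    leaves = {"A": True, "B": True, "C": False,
              "D": True, "E": True, "F": False}
    print(count_origins(tree, leaves))
```

With this toy input the architecture is present in two separate clades, each with an architecture-free sister lineage, so the sketch reports more than one origin. On real data, ambiguous reconstructions and weakly supported branches would need the kind of bootstrap filtering the abstract mentions.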
2.
Bioinformatics ; 23(24): 3382-3, 2007 Dec 15.
Article in English | MEDLINE | ID: mdl-17977882

ABSTRACT

UNLABELLED: PfamAlyzer is a Java applet that enables exploration of Pfam domain architectures using a user-friendly graphical interface. It can search the UniProt protein database for a domain pattern. Domain patterns similar to the query are presented graphically by PfamAlyzer either in a ranked list or pinned to the tree of life. Such domain-centric homology search can assist identification of distant homologs with shared domain architecture. AVAILABILITY: PfamAlyzer has been integrated with the Pfam database and can be accessed at http://pfam.cgb.ki.se/pfamalyzer.


Subjects
Database Management Systems, Protein Databases, Proteins/chemistry, Sequence Alignment/methods, Protein Sequence Analysis/methods, Software, User-Computer Interface, Information Storage and Retrieval/methods, Programming Languages, Protein Tertiary Structure, Amino Acid Sequence Homology
3.
Nucleic Acids Res ; 34(Database issue): D247-51, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381856

ABSTRACT

Pfam is a database of protein families that currently contains 7973 entries (release 18.0). A recent development in Pfam has enabled the grouping of related families into clans. Pfam clans are described in detail, together with the new associated web pages. Improvements to the range of Pfam web tools and the first set of Pfam web services that allow programmatic access to the database and associated tools are also presented. Pfam is available on the web in the UK (http://www.sanger.ac.uk/Software/Pfam/), the USA (http://pfam.wustl.edu/), France (http://pfam.jouy.inra.fr/) and Sweden (http://pfam.cgb.ki.se/).


Subjects
Protein Databases, Proteins/classification, Computer Graphics, Internet, Markov Chains, Protein Tertiary Structure, Proteins/chemistry, Sequence Alignment, Software, User-Computer Interface
4.
Mol Biol Evol ; 22(11): 2257-64, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16049194

ABSTRACT

Distance-based methods are popular for reconstructing evolutionary trees of protein sequences, mainly because of their speed and generality. A number of variants of the classical neighbor-joining (NJ) algorithm have been proposed, as well as a number of methods to estimate protein distances. We here present a large-scale assessment of performance in reconstructing the correct tree topology for the most popular algorithms. The programs BIONJ, FastME, Weighbor, and standard NJ were run using 12 distance estimators, producing 48 tree-building/distance estimation method combinations. These were evaluated on a test set based on real trees taken from 100 Pfam families. Each tree was used to generate multiple sequence alignments with the ROSE program using three evolutionary models. The accuracy of each method was analyzed as a function of both sequence divergence and location in the tree. We found that BIONJ produced the overall best results, although the average accuracy differed little between the tree-building methods (normally less than 1%). A noticeable trend was that FastME performed poorer than the rest on long branches. Weighbor was several orders of magnitude slower than the other programs. Larger differences were observed when using different distance estimators. Protein-adapted Jukes-Cantor and Kimura distance correction produced clearly poorer results than the other methods, even worse than uncorrected distances. We also assessed the recently developed Scoredist measure, which performed equally well as more complex methods.


Subjects
Classification/methods, Molecular Evolution, Genetic Models, Phylogeny, Proteins/genetics, Base Sequence, Cluster Analysis, Computer Simulation, Evaluation Studies as Topic, Sequence Alignment
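
The abstract above compares distance estimators, including Jukes-Cantor and Kimura corrections adapted to proteins. For orientation, the sketch below implements the commonly cited textbook forms of these two corrections, where `p` is the observed fraction of differing aligned positions. Whether the study used exactly these forms is an assumption; the function names are made up for this example.

```python
import math


def jukes_cantor_protein(p: float) -> float:
    """Jukes-Cantor correction for a 20-letter alphabet.

    d = -(19/20) * ln(1 - (20/19) * p)
    Returns infinity once the observed divergence reaches saturation.
    """
    arg = 1.0 - (20.0 / 19.0) * p
    if arg <= 0.0:
        return math.inf
    return -(19.0 / 20.0) * math.log(arg)


def kimura_protein(p: float) -> float:
    """Kimura's empirical protein distance.

    d = -ln(1 - p - 0.2 * p^2)
    Returns infinity when the argument of the logarithm is not positive.
    """
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0.0:
        return math.inf
    return -math.log(arg)


if __name__ == "__main__":
    for p in (0.05, 0.2, 0.5, 0.8):
        print(p, jukes_cantor_protein(p), kimura_protein(p))
```

Both functions return a distance slightly larger than `p` at low divergence and grow without bound as `p` approaches saturation, which is one reason such corrections can become unstable on long branches.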
5.
BMC Bioinformatics ; 6: 108, 2005 Apr 27.
Article in English | MEDLINE | ID: mdl-15857510

ABSTRACT

BACKGROUND: Distance-based methods are popular for reconstructing evolutionary trees thanks to their speed and generality. A number of methods exist for estimating distances from sequence alignments, which often involves some sort of correction for multiple substitutions. The problem is to accurately estimate the number of true substitutions given an observed alignment. So far, the most accurate protein distance estimators have looked for the optimal matrix in a series of transition probability matrices, e.g. the Dayhoff series. The evolutionary distance between two aligned sequences is here estimated as the evolutionary distance of the optimal matrix. The optimal matrix can be found either by an iterative search for the Maximum Likelihood matrix, or by integration to find the Expected Distance. As a consequence, these methods are more complex to implement and computationally heavier than correction-based methods. Another problem is that the result may vary substantially depending on the evolutionary model used for the matrices. An ideal distance estimator should produce consistent and accurate distances independent of the evolutionary model used. RESULTS: We propose a correction-based protein sequence estimator called Scoredist. It uses a logarithmic correction of observed divergence based on the alignment score according to the BLOSUM62 score matrix. We evaluated Scoredist and a number of optimal matrix methods using three evolutionary models for both training and testing (Dayhoff, Jones-Taylor-Thornton, and Muller-Vingron), as well as Whelan and Goldman solely for testing. Test alignments with known distances between 0.01 and 2 substitutions per position (1-200 PAM) were simulated using ROSE. Scoredist proved as accurate as the optimal matrix methods, yet substantially more robust. When trained on one model but tested on another one, Scoredist was nearly always more accurate. The Jukes-Cantor and Kimura correction methods were also tested, but were substantially less accurate. CONCLUSION: The Scoredist distance estimator is fast to implement and run, and combines robustness with accuracy. Scoredist has been incorporated into the Belvu alignment viewer, which is available at ftp://ftp.cgb.ki.se/pub/prog/belvu/.


Subjects
Computational Biology/methods, Protein Sequence Analysis/methods, Software, Algorithms, Amino Acid Sequence, Biological Evolution, Calibration, Computer Simulation, Molecular Evolution, Likelihood Functions, Statistical Models, Molecular Sequence Data, Monte Carlo Method, Automated Pattern Recognition, Phylogeny, Amino Acid Sequence Homology
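
The abstract above describes Scoredist only as a logarithmic correction of observed divergence derived from an alignment score. The sketch below shows one plausible shape for such an estimator: the pairwise score is shifted by an expected random score, normalized by the average self-score, and passed through a logarithm. The exact normalization, the calibration factor, the handling of gap columns, and the function name `scoredist_like` are assumptions for illustration, not the published formula. A toy match/mismatch scorer stands in for BLOSUM62 so the example stays self-contained.

```python
import math
from typing import Callable


def toy_score(a: str, b: str) -> float:
    """Stand-in for a substitution matrix lookup (NOT BLOSUM62)."""
    return 4.0 if a == b else -1.0


def scoredist_like(
    seq_a: str,
    seq_b: str,
    score: Callable[[str, str], float],
    expected_random: float,
    calibration: float = 1.0,
) -> float:
    """Log-corrected distance from a normalized alignment score.

    expected_random is the assumed per-column score of unrelated sequences.
    calibration is an assumed scaling factor fitted to an evolutionary model.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns containing a gap are ignored.
    cols = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not cols:
        raise ValueError("no ungapped columns to compare")

    baseline = expected_random * len(cols)
    observed = sum(score(a, b) for a, b in cols) - baseline
    self_a = sum(score(a, a) for a, _ in cols)
    self_b = sum(score(b, b) for _, b in cols)
    upper = (self_a + self_b) / 2.0 - baseline
    if upper <= 0.0:
        raise ValueError("self-score does not exceed the random baseline")
    if observed <= 0.0:
        return math.inf  # no better than random: distance is saturated
    return calibration * math.log(upper / observed)


if __name__ == "__main__":
    # -0.75 is the mean toy score for uniform random residues (20 letters).
    print(scoredist_like("ACDEFGHIKL", "ACDEFGHIKV", toy_score, -0.75))
```

In practice the score function would look up a real substitution matrix, and both `expected_random` and `calibration` would be fitted on simulated alignments with known distances, as the abstract describes for training on evolutionary models.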
6.
Biotechniques ; 37(2): 282-4, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15335221

ABSTRACT

On the basis of shotgun subclone libraries used in the sequencing of the Drosophila melanogaster genome, a minimal tiling path of subclones across much of the genome was determined. About 320,000 shotgun clones for chromosomes X(12-20), 2R, 2L, 3R, and 4 were available from the Berkeley Drosophila Genome Project. The clone inserts have an average length of 3.4 kb and are amenable to standard PCR amplification. The resulting tiling path covers 86.2% of chromosome X(12-20), 86.2% of chromosomal arm 2R, 79.0% of 2L, 89.6% of 3R, and 80.5% of chromosome 4. In total, the 25,135 clones represent 76.7 Mb (equivalent to about 67% of the genome) and would be suitable for producing a microarray on a single slide.


Subjects
Chromosome Mapping/methods, Molecular Cloning/methods, Drosophila melanogaster/genetics, Oligonucleotide Array Sequence Analysis/methods, Polymerase Chain Reaction/methods, DNA Sequence Analysis/methods, Animals
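
The abstract above reports a minimal tiling path of clones but does not say how it was computed. One standard way to pick a small covering set of overlapping intervals is a greedy sweep, sketched below under the assumption that each clone is a half-open `(start, end)` interval on one chromosome. The function names and the interval representation are hypothetical, and the actual study may have used different criteria (for example minimum overlap or clone quality).

```python
from typing import List, Tuple

Clone = Tuple[int, int]  # (start, end), half-open coordinates


def minimal_tiling_path(clones: List[Clone]) -> List[Clone]:
    """Greedy interval cover: repeatedly take the clone reaching furthest.

    Regions that no clone covers are skipped, so the path may contain gaps.
    """
    ordered = sorted(clones)
    path: List[Clone] = []
    covered_end = None
    i = 0
    while i < len(ordered):
        if covered_end is None or ordered[i][0] > covered_end:
            # Gap in coverage: restart at the next available clone start.
            reach_from = ordered[i][0]
        else:
            reach_from = covered_end
        best = None
        # Among clones that start within the covered region, keep the one
        # that extends furthest to the right.
        while i < len(ordered) and ordered[i][0] <= reach_from:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is not None and (covered_end is None or best[1] > covered_end):
            path.append(best)
            covered_end = best[1]
    return path


def covered_length(path: List[Clone]) -> int:
    """Total number of positions covered by a path sorted by start."""
    total, prev_end = 0, None
    for start, end in path:
        begin = start if prev_end is None else max(start, prev_end)
        total += max(0, end - begin)
        prev_end = end if prev_end is None else max(prev_end, end)
    return total


if __name__ == "__main__":
    clones = [(0, 10), (2, 5), (5, 20), (8, 15), (18, 30), (40, 50)]
    path = minimal_tiling_path(clones)
    print(path, covered_length(path))
```

Dividing `covered_length` by the chromosome length gives a coverage fraction comparable in spirit to the percentages quoted in the abstract.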
7.
Nucleic Acids Res ; 32(Database issue): D138-41, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681378

ABSTRACT

Pfam is a large collection of protein families and domains. Over the past 2 years the number of families in Pfam has doubled and now stands at 6190 (version 10.0). Methodology improvements for searching the Pfam collection locally as well as via the web are described. Other recent innovations include modelling of discontinuous domains allowing Pfam domain definitions to be closer to those found in structure databases. Pfam is available on the web in the UK (http://www.sanger.ac.uk/Software/Pfam/), the USA (http://pfam.wustl.edu/), France (http://pfam.jouy.inra.fr/) and Sweden (http://Pfam.cgb.ki.se/).


Subjects
Protein Databases, Proteins/chemistry, Proteins/classification, Animals, Computational Biology, Humans, Internet, Molecular Models, Multigene Family, Protein Tertiary Structure
8.
Bioinformatics ; 18(9): 1272-3, 2002 Sep.
Article in English | MEDLINE | ID: mdl-12217923

ABSTRACT

SUMMARY: Orthostrapper is a program that calculates orthology support values for pairs of sequences in a multiple alignment (Storm and Sonnhammer, Bioinformatics, 18, 92-99, 2002). Here we present OrthoGUI, a web interface and display tool for Orthostrapper analysis. OrthoGUI visualizes the Orthostrapper output in both tabular and tree representations, and can also apply a clustering algorithm to identify groups of multiple orthologs, which are indicated by colour coding. AVAILABILITY: http://www.cgb.ki.se/OrthoGUI CONTACT: erik.sonnhammer@cgb.ki.se


Subjects
Computer Graphics, Protein Databases, Information Storage and Retrieval/methods, Sequence Alignment/methods, Protein Sequence Analysis/methods, Sequence Homology, ATP-Binding Cassette Transporters/genetics, ATP-Binding Cassette Transporters/metabolism, Adenosine Triphosphate/metabolism, Algorithms, Data Display, Humans, Internet, Protein Binding/genetics, Sensitivity and Specificity, Species Specificity, User-Computer Interface