1.
J Bioinform Comput Biol ; 13(5): 1543003, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26542446

ABSTRACT

Determining the entire complement of enzymes and their enzymatic functions is a fundamental step in reconstructing the metabolic network of cells. High-quality enzyme annotation helps enhance metabolic networks reconstructed from the genome, especially by reducing gaps and increasing enzyme coverage. Currently, structure-based and network-based approaches can cover only a limited number of enzyme families, and the accuracy of homology-based approaches can be further improved. The bottom-up homology-based approach improves coverage by rebuilding Hidden Markov Model (HMM) profiles for all known enzymes. However, its clustering procedure relies heavily on the BLAST similarity score, ignores protein domains/patterns, and is sensitive to changes in cut-off thresholds. Here, we use functional domain architecture to score the association between domain families and enzyme families (Domain-Enzyme Association Scoring, DEAS). The DEAS score is used to calculate the similarity between proteins, which is then used in the clustering procedure in place of the sequence similarity score. We improve the enzyme annotation protocol by using a stringent classification procedure, choosing optimal threshold settings, and checking for active sites. Our analysis shows that our stringent protocol, EnzDP, can cover up to 90% of the enzyme families available in Swiss-Prot. It achieves a high accuracy of 94.5% based on five-fold cross-validation. EnzDP outperforms existing methods across several testing scenarios. Thus, EnzDP serves as a reliable automated tool for enzyme annotation and metabolic network reconstruction. Available at: www.comp.nus.edu.sg/~nguyennn/EnzDP


Subjects
Computational Biology/methods; Enzymes/chemistry; Enzymes/metabolism; Metabolic Networks and Pathways; Catalytic Domain; Cluster Analysis; Databases, Protein; Enzymes/classification; Machine Learning; Markov Chains; Protein Structure, Tertiary; Sequence Alignment; Structural Homology, Protein
2.
J Bioinform Comput Biol ; 11(6): 1343004, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24372033

ABSTRACT

A synteny block is a set of contiguous genes located on the same chromosome and well conserved among various species. Even after long evolutionary processes and genome rearrangement events, large numbers of synteny blocks remain highly conserved across multiple species. Understanding the distribution of conserved gene blocks helps evolutionary biologists trace the diversity of life, and it also plays an important role in orthologous gene detection and gene annotation in the genomic era. In this work, we focus on collinear synteny detection, in which gene order is required to be well conserved among multiple species. To achieve this goal, suffix-tree-based algorithms for efficiently identifying homologous synteny blocks are proposed. The traditional suffix tree algorithm is modified by treating a chromosome as a string and encoding each gene in a chromosome as a symbol. Hence, a suffix tree can be built for different query chromosomes from various species. We can then efficiently search for conserved synteny blocks, which are modeled as overlapped contiguous edges in our suffix tree. In addition, we define a novel Synteny Block Conserved Index (SBCI) to evaluate the relationship of synteny block distribution between two species; it could be applied as an evolutionary indicator for constructing a phylogenetic tree from multiple species without the large computational cost of whole-genome sequence alignment.


Subjects
Algorithms; Evolution, Molecular; Models, Genetic; Synteny; Animals; Chromosomes; Genome; Humans
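
The gene-as-symbol encoding described in this abstract can be illustrated with a small sketch. It is not the authors' suffix tree algorithm: as a stated simplification it uses a quadratic-time dynamic-programming table to find maximal common contiguous runs between two chromosomes, considers only same-orientation matches, and does not compute the SBCI. The function names `encode` and `collinear_blocks` and the gene labels are hypothetical.

```python
from typing import Dict, List, Tuple


def encode(chromosome: List[str], alphabet: Dict[str, int]) -> List[int]:
    """Map each gene name to an integer symbol, sharing one alphabet across species."""
    return [alphabet.setdefault(gene, len(alphabet)) for gene in chromosome]


def collinear_blocks(a: List[int], b: List[int], min_len: int = 2) -> List[Tuple[int, int, int]]:
    """Return maximal common contiguous runs as (start_in_a, start_in_b, length).

    Assumption: a plain O(len(a) * len(b)) table replaces the suffix tree
    used in the paper; it finds the same kind of collinear run, only slower.
    """
    n, m = len(a), len(b)
    # run[i][j] = length of the common run ending at a[i-1] and b[j-1]
    run = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1

    blocks = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            length = run[i][j]
            # A run is maximal if the next pair of genes does not extend it.
            extended = i < n and j < m and a[i] == b[j]
            if length >= min_len and not extended:
                blocks.append((i - length, j - length, length))
    return blocks


alphabet: Dict[str, int] = {}
species_a = encode(["g1", "g2", "g3", "g4", "g5"], alphabet)
species_b = encode(["g9", "g2", "g3", "g4", "g7", "g1"], alphabet)
blocks = collinear_blocks(species_a, species_b)
```

In this toy input the only block of two or more genes is `g2 g3 g4`, reported by its start offsets in each chromosome and its length. A suffix tree or suffix automaton would be the natural replacement once chromosomes contain thousands of genes.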
3.
J Bioinform Comput Biol ; 10(6): 1231002, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22867628

ABSTRACT

This paper is a self-contained introductory tutorial on the problem in proteomics known as peptide sequencing using tandem mass spectrometry. This tutorial deals specifically with de novo sequencing methods (as opposed to database search methods). We first give an introduction to peptide sequencing, its importance and history and some background on proteins. Next we show the relationship between a peptide and the final spectrum produced from a tandem mass spectrometer, together with a description of the various sources of complications that arise during the process of generating the mass spectrum. From there we model the computational problem of de novo peptide sequencing, which is basically the reverse problem of identifying the peptide which produced the spectrum. We then present several major approaches to solve it (including reviewing some of the current algorithms in each approach), and also discuss related problems and post-processing approaches.


Subjects
Mass Spectrometry/methods; Peptides/chemistry; Proteomics/methods; Algorithms; Amino Acid Sequence; Databases, Protein; Molecular Sequence Data; Proteins/chemistry; Sequence Analysis, Protein; Tandem Mass Spectrometry/methods
4.
Int J Data Min Bioinform ; 1(4): 372-88, 2007.
Article in English | MEDLINE | ID: mdl-18402048

ABSTRACT

Dead End Elimination (DEE) is a technique for eliminating rotamers that cannot exist in any global minimum energy configuration for the protein side chain conformation problem. A popular method is Simple Goldstein DEE (SG-DEE), which is fast and eliminates rotamers by considering single residues for possible elimination. We present a Merge-Decoupling DEE (MD-DEE) that further reduces the number of rotamers after SG-DEE. MD-DEE works by forming residue pairs but is fast and, like SG-DEE, is practical even for large proteins. Our experiments show that MD-DEE achieves further reduction in residue elimination (up to 25%) after SG-DEE.


Subjects
Algorithms; Amino Acid Sequence; Protein Structure, Secondary; Protein Structure, Tertiary
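
For readers unfamiliar with the single-residue step mentioned in this abstract, the following is a minimal sketch of the Goldstein elimination criterion as it is commonly stated in the DEE literature: rotamer r at residue i is removed if some other rotamer t at i satisfies E(i_r) - E(i_t) + sum over j of min over s of [E(i_r, j_s) - E(i_t, j_s)] > 0. The data layout (`self_e`, `pair_e`), the function names, and the toy energies are assumptions made for illustration; the abstract does not describe MD-DEE in enough detail to sketch it, so only the single-residue step appears here.

```python
from typing import Dict, List, Set, Tuple

SelfEnergy = List[List[float]]                          # self_e[i][r]
PairEnergy = Dict[Tuple[int, int], List[List[float]]]   # pair_e[(i, j)][r][s], with i < j


def pair_energy(pair_e: PairEnergy, i: int, r: int, j: int, s: int) -> float:
    """Look up E(i_r, j_s) regardless of which residue index is smaller."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def can_eliminate(i: int, r: int, t: int, self_e: SelfEnergy,
                  pair_e: PairEnergy, alive: List[Set[int]]) -> bool:
    """Goldstein test: is rotamer r at residue i always worse than rotamer t?"""
    gap = self_e[i][r] - self_e[i][t]
    for j in range(len(self_e)):
        if j == i:
            continue
        # Worst case for r over the surviving rotamers of residue j.
        gap += min(pair_energy(pair_e, i, r, j, s) - pair_energy(pair_e, i, t, j, s)
                   for s in alive[j])
    return gap > 0


def goldstein_dee(self_e: SelfEnergy, pair_e: PairEnergy) -> List[Set[int]]:
    """Repeat single-rotamer elimination until no rotamer can be removed."""
    alive = [set(range(len(energies))) for energies in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(len(self_e)):
            for r in sorted(alive[i]):
                if any(t != r and can_eliminate(i, r, t, self_e, pair_e, alive)
                       for t in sorted(alive[i])):
                    alive[i].discard(r)
                    changed = True
    return alive


# Toy problem (made-up energies): two residues with two rotamers each.
self_e = [[0.0, 5.0], [0.0, 1.0]]
pair_e = {(0, 1): [[0.0, 0.0], [1.0, 1.0]]}
surviving = goldstein_dee(self_e, pair_e)
```

Because the inequality is strict, two rotamers of the same residue can never eliminate each other, so at least one rotamer survives at every residue.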
5.
J Bioinform Comput Biol ; 4(6): 1329-52, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17245817

ABSTRACT

Peptide sequencing using tandem mass spectrometry data is an important and challenging problem in proteomics. We address the problem of peptide sequencing for multi-charge spectra. Most peptide sequencing algorithms currently consider only charge one or two ions even for higher-charge spectra. We give a characterization of multi-charge spectra by generalizing existing models. Using our models, we analyzed spectra from Global Proteome Machine (GPM) [Craig R, Cortens JP, Beavis RC, J Proteome Res 3:1234-1242, 2004.] (with charges 1-5), Institute for Systems Biology (ISB) [Keller A, Purvine S, Nesvizhskii AI, Stolyar S, Goodlett DR, Kolker E, OMICS 6:207-212, 2002.] and Orbitrap (both with charges 1-3). Our analysis for the GPM dataset shows that higher charge peaks contribute significantly to prediction of the complete peptide. They also help to explain why existing algorithms do not perform well on multi-charge spectra. Based on these analyses, we claim that peptide sequencing algorithms can achieve higher sensitivity results if they also consider higher charge ions. We verify this claim by proposing a de novo sequencing algorithm called the greedy best strong tag (GBST) algorithm that is simple but considers higher charge ions based on our new model. Evaluation on multi-charge spectra shows that our simple GBST algorithm outperforms Lutefisk and PepNovo, especially for the GPM spectra of charge three or more.


Subjects
Algorithms; Mass Spectrometry/methods; Models, Chemical; Peptide Mapping/methods; Peptides/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Computer Simulation; Molecular Sequence Data
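
To make the role of higher-charge ions in this abstract concrete, here is a small sketch of where b- and y-ion peaks would fall at each charge state, using the standard relation m/z = (M + z * m_proton) / z with monoisotopic residue masses. It does not reproduce the paper's generalized spectrum model or the GBST algorithm, neither of which is specified in the abstract; the function name `fragment_mz` and the example peptide are hypothetical.

```python
from typing import Dict, Tuple

PROTON = 1.007276    # mass of a proton, in Da
WATER = 18.010565    # monoisotopic mass of H2O, in Da

# Monoisotopic residue masses of the 20 standard amino acids, in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_mz(peptide: str, max_charge: int) -> Dict[Tuple[str, int, int], float]:
    """Return {(ion_type, fragment_length, charge): m/z} for b- and y-ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions: Dict[Tuple[str, int, int], float] = {}
    for k in range(1, len(peptide)):
        b_neutral = sum(masses[:k])            # N-terminal prefix of k residues
        y_neutral = sum(masses[-k:]) + WATER   # C-terminal suffix of k residues plus water
        for z in range(1, max_charge + 1):
            ions[("b", k, z)] = (b_neutral + z * PROTON) / z
            ions[("y", k, z)] = (y_neutral + z * PROTON) / z
    return ions


ions = fragment_mz("GAK", max_charge=3)
for key in sorted(ions):
    print(key, round(ions[key], 4))
```

The same fragment appears at a different m/z for each charge state, so an algorithm that assumes only charge one or two will leave the peaks from higher-charge fragments unexplained.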