1.
Genome Res ; 11(2): 290-9, 2001 Feb.
Article in English | MEDLINE | ID: mdl-11157792

ABSTRACT

Although protein identification by matching tandem mass spectra (MS/MS) against protein databases is a widespread tool in mass spectrometry, the question about reliability of such searches remains open. Absence of rigorous significance scores in MS/MS database search makes it difficult to discard random database hits and may lead to erroneous protein identification, particularly in the case of mutated or post-translationally modified peptides. This problem is especially important for high-throughput MS/MS projects when the possibility of expert analysis is limited. Thus, algorithms that sort out reliable database hits from unreliable ones and identify mutated and modified peptides are sought. Most MS/MS database search algorithms rely on variations of the Shared Peaks Count approach that scores pairs of spectra by the peaks (masses) they have in common. Although this approach proved to be useful, it has a high error rate in identification of mutated and modified peptides. We describe new MS/MS database search tools, MS-CONVOLUTION and MS-ALIGNMENT, which implement the spectral convolution and spectral alignment approaches to peptide identification. We further analyze these approaches to identification of modified peptides and demonstrate their advantages over the Shared Peaks Count. We also use the spectral alignment approach as a filter in a new database search algorithm that reliably identifies peptides differing by up to two mutations/modifications from a peptide in a database.
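The difference between the Shared Peaks Count and spectral convolution can be illustrated with a toy example. The sketch below is not the MS-CONVOLUTION implementation; it only assumes that a spectrum is a list of peak masses and that the convolution is the multiset of pairwise mass differences. The function names, the tolerance, and the toy masses are hypothetical.

```python
from collections import Counter

def shared_peaks_count(spec_a, spec_b, tol=0.5):
    """Count peaks of spec_a that have a peak of spec_b within tol."""
    return sum(1 for a in spec_a if any(abs(a - b) <= tol for b in spec_b))

def spectral_convolution(spec_a, spec_b, decimals=1):
    """Multiset of pairwise mass differences a - b, rounded to a fixed precision."""
    return Counter(round(a - b, decimals) for a in spec_a for b in spec_b)

# Hypothetical toy spectra: spec_mod is spec_ref with the last two peaks
# shifted by +16, as a single modification of mass 16 would do.
spec_ref = [113, 227, 356, 503]
spec_mod = [113, 227, 372, 519]

spc = shared_peaks_count(spec_mod, spec_ref)
conv = spectral_convolution(spec_mod, spec_ref)

print("shared peaks:", spc)
print("most common offsets:", conv.most_common(2))
```

In this toy case the Shared Peaks Count sees only the two unshifted peaks, while the convolution shows a second offset (16) supported by as many peak pairs as offset 0, which hints at a modification of that mass.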


Subjects
Computational Biology/methods, Factual Databases, Mass Spectrometry/methods, Mutation, Proteins/analysis, Proteins/genetics, Algorithms, Amino Acid Sequence, Computational Biology/statistics & numerical data, Database Management Systems, Factual Databases/statistics & numerical data, Fungal Proteins/analysis, Fungal Proteins/genetics, Fungal Proteins/metabolism, Mass Spectrometry/statistics & numerical data, Molecular Sequence Data, Peptide Fragments/analysis, Peptide Fragments/genetics, Peptide Fragments/metabolism, Proteins/metabolism
2.
J Comput Biol ; 7(6): 777-87, 2000.
Article in English | MEDLINE | ID: mdl-11382361

ABSTRACT

Database search in tandem mass spectrometry is a powerful tool for protein identification. High-throughput spectral acquisition raises the problem of dealing with genetic variation and peptide modifications within a population of related proteins. A method that cross-correlates and clusters related spectra in large collections of uncharacterized spectra (i.e., from normal and diseased individuals) would be very valuable in functional proteomics. This problem is far from being simple since very similar peptides may have very different spectra. We introduce a new notion of spectral similarity that allows one to identify related spectra even if the corresponding peptides have multiple modifications/mutations. Based on this notion, we developed a new algorithm for mutation-tolerant database search as well as a method for cross-correlating related uncharacterized spectra.


Subjects
Algorithms, Computer-Assisted Image Processing, Mass Spectrometry/methods, Mutation, Proteins/genetics, Factual Databases, Proteins/chemistry, Software
3.
J Comput Biol ; 6(3-4): 327-42, 1999.
Article in English | MEDLINE | ID: mdl-10582570

ABSTRACT

Peptide sequencing via tandem mass spectrometry (MS/MS) is one of the most powerful tools in proteomics for identifying proteins. Because complete genome sequences are accumulating rapidly, the recent trend in interpretation of MS/MS spectra has been database search. However, de novo MS/MS spectral interpretation remains an open problem typically involving manual interpretation by expert mass spectrometrists. We have developed a new algorithm, SHERENGA, for de novo interpretation that automatically learns fragment ion types and intensity thresholds from a collection of test spectra generated from any type of mass spectrometer. The test data are used to construct optimal path scoring in the graph representations of MS/MS spectra. A ranked list of high scoring paths corresponds to potential peptide sequences. SHERENGA is most useful for interpreting sequences of peptides resulting from unknown proteins and for validating the results of database search algorithms in fully automated, high-throughput peptide sequencing.
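The idea of scoring paths in a graph representation of a spectrum can be sketched as follows. This is a simplified illustration, not SHERENGA itself: it assumes peaks are already converted to integer prefix masses with given scores, uses a four-residue alphabet with nominal masses, and leaves out the learning of ion types and intensity thresholds. All names are hypothetical.

```python
# Nominal residue masses for a small, illustrative subset of amino acids.
RESIDUES = {"G": 57, "A": 71, "S": 87, "V": 99}

def best_path(peaks, total_mass):
    """Return (score, sequence) of the highest-scoring path from mass 0 to total_mass.

    peaks maps a prefix mass to its score. An edge joins two masses that
    differ by exactly one residue mass, so every edge points to a larger
    mass and the nodes can be processed in increasing order.
    """
    nodes = sorted(set(peaks) | {0, total_mass})
    best = {0: (0, "")}
    for mass in nodes:
        if mass not in best:
            continue  # not reachable from mass 0
        score, seq = best[mass]
        for aa, delta in RESIDUES.items():
            nxt = mass + delta
            if nxt in peaks or nxt == total_mass:
                cand = (score + peaks.get(nxt, 0), seq + aa)
                if nxt not in best or cand[0] > best[nxt][0]:
                    best[nxt] = cand
    return best.get(total_mass)

# Hypothetical spectrum for the peptide GAVS (prefix masses 57, 128, 227)
# plus one low-scoring noise peak at 156.
toy_peaks = {57: 10, 128: 8, 227: 9, 156: 3}
result = best_path(toy_peaks, total_mass=314)
print(result)
```

Keeping a ranked list of several high-scoring paths, as the abstract describes, would require storing more than one candidate per node; the sketch keeps only the best one.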


Subjects
Algorithms, Mass Spectrometry/methods, Peptides/chemistry, Sequence Analysis/methods, Amino Acid Sequence, Factual Databases, Evaluation Studies as Topic, Mass Spectrometry/statistics & numerical data, Sequence Analysis/statistics & numerical data
4.
J Comput Biol ; 5(3): 505-15, 1998.
Article in English | MEDLINE | ID: mdl-9773346

ABSTRACT

A fundamentally new molecular-biology approach to constructing restriction maps, Optical Mapping, has been developed by Schwartz et al. (1993). With this method, restriction maps are constructed from fluorescence intensity and length measurements. However, it is difficult to directly estimate the restriction site locations of single DNA molecules based on these optical mapping data because of the limited precision of length measurements and the unknown number of true restriction sites in the data. We propose the use of a hierarchical Bayes model based on a mixture model with normals and random noise. In this model we explicitly consider the missing observation structure of the data, such as the orientations of molecules, the allocations of cutting sites to restriction sites, and the indicator variables of whether observed cut sites are true or false. Because of the complexity of the model, the large amount of missing data, and the unknown number of restriction sites, we use Reversible-Jump Markov Chain Monte Carlo (MCMC) to estimate the number and the locations of the restriction sites. Since there exists a high multimodality due to unknown orientations of molecules, we also use a combination of our MCMC approach and the flipping algorithm suggested by Dancík and Waterman (1997). The study is highly computer-intensive and the development of an efficient algorithm is required.


Subjects
Algorithms, Restriction Mapping/methods, Markov Chains, Genetic Models, Monte Carlo Method
5.
J Comput Biol ; 4(3): 275-96, 1997.
Article in English | MEDLINE | ID: mdl-9278060

ABSTRACT

We consider the problem of determining the three-dimensional folding of a protein given its one-dimensional amino acid sequence. We use the HP model for protein folding proposed by Dill (1985), which models a protein as a chain of amino acid residues that are either hydrophobic or polar, and in which hydrophobic interactions are the dominant initial driving force for folding. Hart and Istrail (1996a) gave approximation algorithms for folding proteins on the cubic lattice under the HP model. In this paper, we examine the choice of a lattice by considering its algorithmic and geometric implications and argue that the triangular lattice is a more reasonable choice. We present a set of folding rules for a triangular lattice and analyze the approximation ratio they achieve. In addition, we introduce a generalization of the HP model to account for residues having different levels of hydrophobicity. After describing the biological foundation for this generalization, we show that in the new model we are able to achieve similar constant factor approximation guarantees on the triangular lattice as were achieved in the standard HP model. While the structures derived from our folding rules are probably still far from biological reality, we hope that having a set of folding rules with different properties will yield more interesting folds when combined.
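How a fold is scored under the HP model can be made concrete with a short sketch. It assumes a 2D triangular lattice encoded in axial coordinates (six neighbors per site) and counts hydrophobic contacts between residues that are not adjacent in the chain. The encoding and the function name are assumptions for illustration, and the paper's folding rules and approximation analysis are not reproduced here.

```python
# Six neighbor offsets of a 2D triangular lattice in axial coordinates
# (an assumed encoding, chosen for illustration).
TRI_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def hh_contacts(sequence, coords):
    """Count H-H contacts between residues that are not chain neighbors.

    sequence is a string over {'H', 'P'}; coords lists one (x, y) lattice
    site per residue, in chain order.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if (x1 - x0, y1 - y0) not in TRI_NEIGHBORS:
            raise ValueError("consecutive residues must be lattice neighbors")

    index = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in TRI_NEIGHBORS:
            j = index.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips chain neighbors.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return contacts

# Three residues folded into a triangle: residues 0 and 2 touch.
triangle = [(0, 0), (1, 0), (0, 1)]
print(hh_contacts("HHH", triangle))
```

The triangle example shows one geometric difference between lattices: residues two positions apart in the chain can be in contact here, which the parity of the square and cubic lattices rules out.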


Subjects
Chemical Models, Protein Conformation, Protein Folding, Algorithms, Molecular Models
6.
J Comput Biol ; 4(2): 119-25, 1997.
Article in English | MEDLINE | ID: mdl-9228611

ABSTRACT

Optical mapping is a new technology for constructing restriction maps. Associated computational problems include aligning multiple partial restriction maps into a single "consensus" restriction map, and determining the correct orientation of each molecule, which was formalized as the Exclusive Binary Flip Cut (EBFC) Problem by Muthukrishnan and Parida (1997). Here we prove that the EBFC problem and a number of its variants are NP-complete. Therefore, they do not have efficient, that is, polynomial-time, solutions unless P = NP.


Subjects
Algorithms, Restriction Mapping/methods, Computer-Assisted Image Processing, Theoretical Models, Optics and Photonics
...