Results 1 - 8 of 8
1.
Nat Methods ; 10(6): 563-9, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23644548

ABSTRACT

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.


Subjects
Bacterial Genome , DNA Sequence Analysis/methods , Bacterial Artificial Chromosomes , Escherichia coli/genetics , Gene Library , Humans , Nucleic Acid Repetitive Sequences
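
The seed-selection idea in the abstract above (use the longest reads as seeds) can be illustrated with a minimal sketch. It assumes seeds are taken from longest to shortest until they reach a target depth over the genome; `genome_size` and `seed_coverage` are hypothetical parameters introduced here for illustration, not values or names from the paper, and the consensus and assembly steps are not shown.

```python
def select_seed_reads(read_lengths, genome_size, seed_coverage):
    """Pick the longest reads until they total genome_size * seed_coverage bases.

    Returns (seed_indices, length_cutoff). genome_size and seed_coverage are
    hypothetical knobs for this sketch, not parameters taken from the paper.
    """
    target_bases = genome_size * seed_coverage
    # Visit reads from longest to shortest, remembering original positions.
    order = sorted(range(len(read_lengths)),
                   key=lambda i: read_lengths[i], reverse=True)
    seeds, total = [], 0
    for i in order:
        if total >= target_bases:
            break
        seeds.append(i)
        total += read_lengths[i]
    # The shortest selected seed defines the implied length cutoff.
    cutoff = read_lengths[seeds[-1]] if seeds else 0
    return seeds, cutoff


# Toy read lengths (made up for the example).
lengths = [1200, 8000, 500, 6500, 3000, 9000]
seeds, cutoff = select_seed_reads(lengths, genome_size=1000, seed_coverage=20)
```

All reads shorter than the cutoff would then be the ones recruited to the seeds for consensus.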
2.
J Proteome Res ; 8(4): 2106-13, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19275164

ABSTRACT

Obtaining accurate peptide identifications from shotgun proteomics liquid chromatography tandem mass spectrometry (LC-MS/MS) experiments requires a score function that consistently ranks correct peptide-spectrum matches (PSMs) above incorrect matches. We have observed that, for the Sequest score function Xcorr, the inability to discriminate between correct and incorrect PSMs is due in part to spectrum-specific properties of the score distribution. In other words, some spectra score well regardless of which peptides they are scored against, and other spectra score well because they are scored against a large number of peptides. We describe a protocol for calibrating PSM score functions, and we demonstrate its application to Xcorr and the preliminary Sequest score function Sp. The protocol accounts for spectrum- and peptide-specific effects by calculating p values for each spectrum individually, using only that spectrum's score distribution. We demonstrate that these calculated p values are uniform under a null distribution and therefore accurately measure significance. These p values can be used to estimate the false discovery rate, thereby eliminating the need for an extra search against a decoy database. In addition, we show that the p values are better calibrated than their underlying scores; consequently, when ranking top-scoring PSMs from multiple spectra, p values are better at discriminating between correct and incorrect PSMs. The calibration protocol is generally applicable to any PSM score function for which an appropriate parametric family can be identified.


Subjects
Algorithms , Software , Tandem Mass Spectrometry/methods , Calibration
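
The abstract above notes that calibrated p values can be used to estimate the false discovery rate without a decoy search. One standard way to do that is the Benjamini-Hochberg procedure, sketched below. This is an assumption chosen for illustration; the abstract does not say which FDR estimator the authors used, and the p values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so the q values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy p values for four top-scoring PSMs (illustrative only).
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
```

Accepting every PSM whose q value is at most 0.05 would then control the estimated FDR at 5% under the procedure's usual assumptions.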
3.
Bioinformatics ; 24(13): i348-56, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18586734

ABSTRACT

MOTIVATION: Tandem mass spectrometry (MS/MS) is an indispensable technology for identification of proteins from complex mixtures. Proteins are digested to peptides that are then identified by their fragmentation patterns in the mass spectrometer. Thus, at its core, MS/MS protein identification relies on the relative predictability of peptide fragmentation. Unfortunately, peptide fragmentation is complex and not fully understood, and what is understood is not always exploited by peptide identification algorithms. RESULTS: We use a hybrid dynamic Bayesian network (DBN)/support vector machine (SVM) approach to address these two problems. We train a set of DBNs on high-confidence peptide-spectrum matches. These DBNs, known collectively as Riptide, comprise a probabilistic model of peptide fragmentation chemistry. Examination of the distributions learned by Riptide allows identification of new trends, such as prevalent a-ion fragmentation at peptide cleavage sites C-term to hydrophobic residues. In addition, Riptide can be used to produce likelihood scores that indicate whether a given peptide-spectrum match is correct. A vector of such scores is evaluated by an SVM, which produces a final score to be used in peptide identification. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate. AVAILABILITY: Python and C source code are available upon request from the authors. The curated training sets are available at http://noble.gs.washington.edu/proj/intense/. The Graphical Model Tool Kit (GMTK) is freely available at http://ssli.ee.washington.edu/bilmes/gmtk.


Subjects
Algorithms , Artificial Intelligence , Mass Spectrometry/methods , Automated Pattern Recognition/methods , Peptide Mapping/methods , Protein Sequence Analysis/methods , Amino Acid Sequence , Bayes Theorem , Molecular Sequence Data
4.
J Proteome Res ; 7(7): 3022-7, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18505281

ABSTRACT

Mass spectrometry, the core technology in the field of proteomics, promises to enable scientists to identify and quantify the entire complement of proteins in a complex biological sample. Currently, the primary bottleneck in this type of experiment is computational. Existing algorithms for interpreting mass spectra are slow and fail to identify a large proportion of the given spectra. We describe a database search program called Crux that reimplements and extends the widely used database search program Sequest. For speed, Crux uses a peptide indexing scheme to rapidly retrieve candidate peptides for a given spectrum. For each peptide in the target database, Crux generates shuffled decoy peptides on the fly, providing a good null model and, hence, accurate false discovery rate estimates. Crux also implements two recently described postprocessing methods: a p value calculation based upon fitting a Weibull distribution to the observed scores, and a semisupervised method that learns to discriminate between target and decoy matches. Both methods significantly improve the overall rate of peptide identification. Crux is implemented in C and is distributed with source code freely to noncommercial users.


Subjects
Peptides/analysis , Algorithms , Computational Biology , Factual Databases , Humans , Peptide Fragments/analysis , Proteomics , Saccharomyces cerevisiae Proteins/analysis , Software , Tandem Mass Spectrometry
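
The decoy strategy described in the abstract above (shuffled decoy peptides as a null model for false discovery rate estimation) can be sketched as follows. This is not Crux's implementation: keeping the terminal residues fixed, the function names, and the simple decoys-over-targets estimator are all assumptions made for illustration, and the scores are invented.

```python
import random


def shuffled_decoy(peptide, rng):
    """Shuffle the interior residues, keeping the first and last in place.

    Keeping the termini fixed is an assumption of this sketch; shuffling
    preserves the amino acid composition and therefore the peptide mass.
    """
    if len(peptide) < 4:
        return peptide  # too short to shuffle meaningfully
    interior = list(peptide[1:-1])
    rng.shuffle(interior)
    return peptide[0] + "".join(interior) + peptide[-1]


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR at a score threshold as (#decoys >= t) / (#targets >= t)."""
    targets = sum(s >= threshold for s in target_scores)
    decoys = sum(s >= threshold for s in decoy_scores)
    return decoys / targets if targets else 0.0


rng = random.Random(0)  # seeded so the example is repeatable
decoy = shuffled_decoy("LVNELTEFAK", rng)
fdr = estimate_fdr([3.1, 2.8, 2.5, 1.9, 1.2],
                   [2.6, 1.5, 1.1, 0.9, 0.7],
                   threshold=2.0)
```

Because each decoy has the same mass as its target, it competes for the same spectra, which is what makes the decoy score distribution a usable null.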
5.
Anal Chem ; 79(16): 6111-8, 2007 Aug 15.
Article in English | MEDLINE | ID: mdl-17622186

ABSTRACT

Most algorithms for identifying peptides from tandem mass spectra use information only from the final spectrum, ignoring non-mass-based information acquired routinely in liquid chromatography tandem mass spectrometry analyses. One physicochemical property that is always obtained but rarely exploited is peptide chromatographic retention time. Efforts to use chromatographic retention time to improve peptide identification are complicated by the variability of retention time across experimental conditions, which makes retention time calculations nongeneralizable. We show that peptide retention time can be reliably predicted by training and testing a support vector regressor on a small collection of data from a single liquid chromatography run. This model can be used to filter peptide identifications with observed retention time that deviates from predicted retention time. After filtering, positive peptide identifications increase by as much as 50% at a false discovery rate of 3%. We demonstrate that our dynamically trained model generalizes well across diverse chromatography conditions and methods for generating peptides, in particular improving peptide identification using nonspecific proteases.


Subjects
Artificial Intelligence , Liquid Chromatography/methods , Peptides/analysis , Tandem Mass Spectrometry/methods , False Positive Reactions , Peptide Hydrolases , Tandem Mass Spectrometry/standards
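
The filtering step described in the abstract above can be sketched in a few lines. The sketch assumes the predicted retention times have already been produced by a regressor trained on the same run (the abstract uses a support vector regressor, which is not reproduced here); the tuple layout, the `max_deviation` tolerance, and all values are hypothetical.

```python
def filter_by_retention_time(identifications, max_deviation):
    """Keep identifications whose observed RT is close to the predicted RT.

    Each identification is a (peptide, observed_rt, predicted_rt) tuple with
    times in minutes; max_deviation is a hypothetical tolerance in minutes.
    """
    return [
        (peptide, observed, predicted)
        for peptide, observed, predicted in identifications
        if abs(observed - predicted) <= max_deviation
    ]


# Made-up identifications: (peptide, observed RT, predicted RT).
ids = [
    ("PEPTIDEK", 21.5, 22.0),
    ("SAMPLER", 40.2, 18.7),
    ("ELVISK", 33.0, 31.9),
]
kept = filter_by_retention_time(ids, max_deviation=2.0)
```

In practice the tolerance would be chosen from the regressor's error on held-out peptides rather than fixed by hand.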
6.
J Proteome Res ; 5(3): 695-700, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16512685

ABSTRACT

In shotgun proteomics, a complex protein mixture is digested to peptides, separated, and identified by microcapillary liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). In this technology, complete protein digestion is often assumed. We show that, to the contrary, modifications to a standard digestion protocol demonstrate large, reproducible improvements in protein identification, a result consistent with digestion being a limiting factor in the efficiency of protein identification.


Subjects
Complex Mixtures/analysis , Peptide Hydrolases , Proteins/analysis , Proteins/metabolism , Complex Mixtures/chemistry , Escherichia coli Proteins/analysis , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/metabolism , Mass Spectrometry , Proteins/chemistry , Proteomics
7.
Anal Chem ; 78(4): 1337-44, 2006 Feb 15.
Article in English | MEDLINE | ID: mdl-16478131

ABSTRACT

A 2D ion trap has a greater ion trapping efficiency, greater ion capacity before observing space-charging effects, and a faster ion ejection rate than a traditional 3D ion trap mass spectrometer. These hardware improvements should result in a significant increase in protein identifications from complex mixtures analyzed using shotgun proteomics. In this study, we compare the quality and quantity of peptide identifications using data-dependent acquisition of tandem mass spectra of peptides between two commercially available ion trap mass spectrometers (an LTQ and an LCQ XP Max). We demonstrate that the increased trapping efficiency, increased ion capacity, and faster ion ejection rate of the LTQ result in greater than 5-fold more protein identifications, better identification of low-abundance proteins, and higher confidence protein identifications when compared with an LCQ XP Max.


Subjects
Mass Spectrometry/methods , Proteomics , Amino Acid Sequence , Molecular Sequence Data
8.
Article in English | MEDLINE | ID: mdl-16447975

ABSTRACT

Mass spectrometry is a particularly useful technology for the rapid and robust identification of peptides and proteins in complex mixtures. Peptide sequences can be identified by correlating their observed tandem mass spectra (MS/MS) with theoretical spectra of peptides from a sequence database. Unfortunately, to perform this search the charge of the peptide must be known, and current charge-state-determination algorithms only discriminate singly- from multiply-charged spectra: distinguishing +2 from +3, for example, is unreliable. Thus, search software is forced to search multiply-charged spectra multiple times. To minimize this inefficiency, we present a support vector machine (SVM) that quickly and reliably classifies multiply-charged spectra as having either a +2 or +3 precursor peptide ion. By classifying multiply-charged spectra, we obtain a 40% reduction in search time while maintaining an average of 99% of peptide and 99% of protein identifications originally obtained from these spectra.


Subjects
Mass Spectrometry/methods , Chemical Models , Automated Pattern Recognition/methods , Peptide Mapping/methods , Peptides/chemistry , Sequence Alignment/methods , Protein Sequence Analysis/methods , Algorithms , Artificial Intelligence , Computer Simulation , Reproducibility of Results , Sensitivity and Specificity , Static Electricity
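
To see why an unknown charge forces repeated searches, as the abstract above describes, it may help to work through the arithmetic. A precursor observed at a given m/z implies a different neutral peptide mass for each assumed charge, so each candidate charge selects a different set of database peptides. The sketch below uses the standard relation M = z * (m/z - proton mass); the function names and the example m/z value are illustrative, not taken from the paper.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(precursor_mz, charge):
    """Neutral peptide mass implied by a precursor m/z at a given charge."""
    return charge * (precursor_mz - PROTON_MASS)


def candidate_masses(precursor_mz, charges=(2, 3)):
    """Masses a search would have to try when the charge is ambiguous."""
    return {z: neutral_mass(precursor_mz, z) for z in charges}


# A precursor at m/z 600.0 (illustrative value) under both assumptions.
masses = candidate_masses(600.0)
```

The two candidate masses differ by roughly 600 Da here, so a classifier that picks +2 or +3 up front removes one full database search per spectrum.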
...