Search | VHL Regional Portal (test)

pQuant improves quantitation by keeping out interfering signals and evaluating the accuracy of calculated ratios.

Liu, Chao; Song, Chun-Qing; Yuan, Zuo-Fei; Fu, Yan; Chi, Hao; Wang, Le-Heng; Fan, Sheng-Bo; Zhang, Kun; Zeng, Wen-Feng; He, Si-Min; Dong, Meng-Qiu; Sun, Rui-Xiang.

Anal Chem ; 86(11): 5286-94, 2014 Jun 03.

Article in English | MEDLINE | ID: mdl-24799117

ABSTRACT

In relative protein abundance determination from peptide intensities recorded in full mass scans, a major complication that affects quantitation accuracy is signal interference from coeluting ions of similar m/z values. Here, we present pQuant, a quantitation software tool that solves this problem. pQuant detects interference signals and identifies, for each peptide, a pair of least-interfered isotopic chromatograms: one for the light and one for the heavy isotope-labeled peptide. On the basis of these isotopic pairs, pQuant calculates the relative heavy/light peptide ratios along with their 99.75% confidence intervals (CIs). From the peptide ratios and their CIs, pQuant estimates the protein ratios and associated CIs by kernel density estimation. We tested pQuant, Census, and MaxQuant on data sets obtained from mixtures (at varying mixing ratios from 10:1 to 1:10) of light- and heavy-SILAC-labeled HeLa cells or ¹⁴N- and ¹⁵N-labeled Escherichia coli cells. pQuant quantitated more peptides with better accuracy than Census and MaxQuant in all 14 data sets. On the SILAC data sets, the nonquantified "NaN" (not a number) ratios generated by Census, MaxQuant, and pQuant accounted for 2.5-10.7%, 1.8-2.7%, and 0.01-0.5% of all ratios, respectively. On the ¹⁴N/¹⁵N data sets, which cannot be quantified by MaxQuant, Census and pQuant produced 0.9-10.0% and 0.3-2.9% NaN ratios, respectively. Excluding these NaN results, the standard deviations of the numerical ratios calculated by Census or MaxQuant are 30-100% larger than those by pQuant. These results show that pQuant outperforms Census and MaxQuant in SILAC and ¹⁵N-based quantitation.

Subjects

Peptides/chemistry , Proteins/chemistry , Escherichia coli/chemistry , HeLa Cells/chemistry , Humans , Isotopes , Mass Spectrometry , Nitrogen Isotopes , Nitrogen Radioisotopes , Software
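
The pQuant abstract above says that protein ratios are estimated from peptide ratios by kernel density estimation. The sketch below illustrates that general idea only: it takes the mode of a Gaussian kernel density over log2 peptide ratios as the protein ratio. The function names, the bandwidth rule, and the grid search are assumptions made for illustration; the abstract does not specify them, and this sketch ignores the per-peptide confidence intervals that pQuant also uses.

```python
import math


def kde_mode(log_ratios, grid_points=401):
    """Return the log-ratio at which a Gaussian kernel density estimate peaks."""
    n = len(log_ratios)
    if n == 0:
        raise ValueError("need at least one finite, positive peptide ratio")
    if n == 1:
        return log_ratios[0]
    lo, hi = min(log_ratios), max(log_ratios)
    if lo == hi:
        return lo  # all peptides agree, nothing to estimate
    mean = sum(log_ratios) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in log_ratios) / (n - 1))
    # Normal-reference rule of thumb for the bandwidth (an assumption here).
    bandwidth = 1.06 * std * n ** (-0.2)
    best_x, best_density = lo, -1.0
    for i in range(grid_points):
        x = lo + (hi - lo) * i / (grid_points - 1)
        # Unnormalized density: the constant factor does not move the peak.
        density = sum(math.exp(-0.5 * ((x - r) / bandwidth) ** 2) for r in log_ratios)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


def protein_ratio(peptide_ratios):
    """Hypothetical helper: protein heavy/light ratio as the KDE mode of peptide ratios."""
    # NaN, infinite, and non-positive ratios are dropped before taking logs.
    logs = [math.log2(r) for r in peptide_ratios if r > 0 and math.isfinite(r)]
    return 2 ** kde_mode(logs)


# Five peptides near 2:1 plus one outlier at 8:1, as interference might produce.
ratios = [1.9, 2.0, 2.1, 2.05, 1.95, 8.0]
print(round(protein_ratio(ratios), 2))
```

Working in log space treats 2:1 and 1:2 symmetrically, and taking the density peak rather than the mean keeps a single interfered peptide from dragging the protein estimate upward.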

Identification of cross-linked peptides from complex samples.

Yang, Bing; Wu, Yan-Jie; Zhu, Ming; Fan, Sheng-Bo; Lin, Jinzhong; Zhang, Kun; Li, Shuang; Chi, Hao; Li, Yu-Xin; Chen, Hai-Feng; Luo, Shu-Kun; Ding, Yue-He; Wang, Le-Heng; Hao, Zhiqi; Xiu, Li-Yun; Chen, She; Ye, Keqiong; He, Si-Min; Dong, Meng-Qiu.

Nat Methods ; 9(9): 904-6, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22772728

ABSTRACT

We have developed pLink, software for data analysis of cross-linked proteins coupled with mass-spectrometry analysis. pLink reliably estimates false discovery rate in cross-link identification and is compatible with multiple homo- or hetero-bifunctional cross-linkers. We validated the program with proteins of known structures, and we further tested it on protein complexes, crude immunoprecipitates and whole-cell lysates. We show that it is a robust tool for protein-structure and protein-protein-interaction studies.

Subjects

Cross-Linking Reagents/chemistry , Peptides/analysis , Peptides/chemistry , Proteomics/methods , Algorithms , Animals , Caenorhabditis elegans/chemistry , High-Pressure Liquid Chromatography , Statistical Data Interpretation , Protein Databases , Escherichia coli/chemistry , False Positive Reactions , Humans , Mass Spectrometry , Molecular Models , Protein Binding , Protein Conformation , Reproducibility of Results , Software

pParse: a method for accurate determination of monoisotopic peaks in high-resolution mass spectra.

Yuan, Zuo-Fei; Liu, Chao; Wang, Hai-Peng; Sun, Rui-Xiang; Fu, Yan; Zhang, Jing-Fen; Wang, Le-Heng; Chi, Hao; Li, You; Xiu, Li-Yun; Wang, Wen-Ping; He, Si-Min.

Proteomics ; 12(2): 226-35, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22106041

ABSTRACT

Determining the monoisotopic peak of a precursor is a first step in interpreting mass spectra, which is basic but non-trivial. The reason is that in the isolation window of a precursor, other peaks interfere with the determination of the monoisotopic peak, leading to a wrong mass-to-charge ratio or charge state. Here we propose a method, named pParse, to export the most probable monoisotopic peaks for precursors, including co-eluted precursors. We use the relationship between the position of the highest peak and the mass of the first peak to detect candidate clusters. Then, we extract three features to sort the candidate clusters: (i) the sum of the intensity, (ii) the similarity of the experimental and the theoretical isotopic distribution, and (iii) the similarity of elution profiles. We showed that the recall of pParse, MaxQuant, and BioWorks was 98-98.8%, 0.5-17%, and 1.8-36.5% at the same precision, respectively. About 50% of tandem mass spectra are triggered by multiple precursors, which are difficult to identify. We therefore designed a new scoring function to identify the co-eluted precursors. About 26% of all identified peptides were exclusively from co-eluted peptides. Therefore, accurately determining monoisotopic peaks, including co-eluted precursors, can greatly increase the peptide identification rate.

Subjects

Peptides/analysis , Proteomics/methods , Software , Tandem Mass Spectrometry/methods , Algorithms , HeLa Cells/chemistry , Humans , Peptides/chemistry , Protein Precursors/analysis , Protein Precursors/chemistry , Reproducibility of Results , Search Engine , Sensitivity and Specificity , Time Factors , Yeasts/chemistry

Speeding up tandem mass spectrometry-based database searching by longest common prefix.

Zhou, Chen; Chi, Hao; Wang, Le-Heng; Li, You; Wu, Yan-Jie; Fu, Yan; Sun, Rui-Xiang; He, Si-Min.

BMC Bioinformatics ; 11: 577, 2010 Nov 25.

Article in English | MEDLINE | ID: mdl-21108792

ABSTRACT

BACKGROUND: Tandem mass spectrometry-based database searching has become an important technology for peptide and protein identification. One of the key challenges in database searching is the remarkable increase in computational demand, brought about by the expansion of protein databases, semi- or non-specific enzymatic digestion, post-translational modifications and other factors. Some software tools use peptide indexing to accelerate processing. However, peptide indexing requires a large amount of time and space for construction, especially for non-specific digestion. It is also inflexible to use.

RESULTS: We developed an algorithm based on the longest common prefix (ABLCP) to efficiently organize a protein sequence database. The longest common prefix is a data structure that is always coupled to the suffix array. It eliminates redundant candidate peptides in databases and reduces the corresponding peptide-spectrum matching times, thereby decreasing the identification time. This algorithm is based on the property of the longest common prefix. Enzymatic digestion poses a challenge to this property, but some adjustments can be made to the algorithm to ensure that no candidate peptides are omitted. Compared with peptide indexing, ABLCP requires much less time and space for construction and is subject to fewer restrictions.

CONCLUSIONS: The ABLCP algorithm can help to improve data analysis efficiency. A software tool implementing this algorithm is available at http://pfind.ict.ac.cn/pfind2dot5/index.htm.

Subjects

Protein Databases , Proteins/chemistry , Tandem Mass Spectrometry/methods , Algorithms , Peptide Mapping , Peptides/chemistry , Protein Sequence Analysis/methods
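
The abstract above relies on the longest common prefix (LCP) array paired with a suffix array to drop redundant candidate peptides. As a rough illustration of that data structure, and not of ABLCP itself, the sketch below builds a suffix array naively, computes the LCP array with Kasai's algorithm, and uses it to list each distinct length-k substring once. The function names and the fixed-length simplification are assumptions; the published algorithm additionally handles enzymatic digestion rules, which are omitted here.

```python
def suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Kasai's algorithm: lcp[i] is the common-prefix length of suffixes sa[i-1] and sa[i]."""
    n = len(text)
    rank = [0] * n
    for i, start in enumerate(sa):
        rank[start] = i
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix that precedes suffix i in sorted order
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h - 1 characters
        else:
            h = 0
    return lcp


def distinct_kmers(text, k):
    """List every distinct length-k substring once, skipping repeats via the LCP array."""
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    result = []
    for i, start in enumerate(sa):
        long_enough = start + k <= len(text)
        # If lcp[i] >= k, this k-mer equals the previous suffix's k-mer: redundant.
        if long_enough and lcp[i] < k:
            result.append(text[start:start + k])
    return result


# Toy sequence with a repeated segment (hypothetical, for illustration only).
print(distinct_kmers("MKAGMKAG", 3))
```

Because suffixes sharing a prefix sit next to each other in the suffix array, a single comparison against `lcp[i]` is enough to tell whether a candidate has already been emitted, which is the property that saves repeated peptide-spectrum matching.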

Improved peptide identification for proteomic analysis based on comprehensive characterization of electron transfer dissociation spectra.

Sun, Rui-Xiang; Dong, Meng-Qiu; Song, Chun-Qing; Chi, Hao; Yang, Bing; Xiu, Li-Yun; Tao, Li; Jing, Zhi-Yi; Liu, Chao; Wang, Le-Heng; Fu, Yan; He, Si-Min.

J Proteome Res ; 9(12): 6354-67, 2010 Dec 03.

Article in English | MEDLINE | ID: mdl-20883037

ABSTRACT

In recent years, electron transfer dissociation (ETD) has enjoyed widespread applications from sequencing of peptides with or without post-translational modifications to top-down analysis of intact proteins. However, peptide identification rates from ETD spectra compare poorly with those from collision induced dissociation (CID) spectra, especially for doubly charged precursors. This is in part due to an insufficient understanding of the characteristics of ETD and consequently a failure of database search engines to make use of the rich information contained in the ETD spectra. In this study, we statistically characterized ETD fragmentation patterns from a collection of 461 440 spectra and subsequently implemented our findings into pFind, a database search engine developed earlier for CID data. From ETD spectra of doubly charged precursors, pFind 2.1 identified 63-122% more unique peptides than Mascot 2.2 under the same 1% false discovery rate. For higher charged peptides as well as phosphopeptides, pFind 2.1 also consistently obtained more identifications. Of the features built into pFind 2.1, the following two greatly enhanced its performance: (1) refined automatic detection and removal of high-intensity peaks belonging to the precursor, charge-reduced precursor, or related neutral loss species, whose presence often set spectral matching askew; (2) a thorough consideration of hydrogen-rearranged fragment ions such as z + H and c - H for peptide precursors of different charge states. Our study has revealed that different charge states of precursors result in different hydrogen rearrangement patterns. For a fragment ion, its propensity of gaining or losing a hydrogen depends on (1) the ion type (c or z) and (2) the size of the fragment relative to the precursor, and both dependencies are affected by (3) the charge state of the precursor. 
In addition, we discovered ETD characteristics that are unique for certain types of amino acids (AAs), such as a prominent neutral loss of SCH₂CONH₂ (90.0014 Da) from z ions with a carbamidomethylated cysteine at the N-terminus and a neutral loss of histidine side chain C₄N₂H₅ (81.0453 Da) from precursor ions containing histidine. The comprehensive list of ETD characteristics summarized in this paper should be valuable for automated database search, de novo peptide sequencing, and manual spectral validation.

Subjects

Mass Spectrometry/methods , Peptides/analysis , Proteomics/methods , Amino Acid Sequence , Electron Transport , Molecular Sequence Data , Peptides/chemistry , Phosphopeptides/analysis , Phosphopeptides/chemistry , Reproducibility of Results

pNovo: de novo peptide sequencing and identification using HCD spectra.

Chi, Hao; Sun, Rui-Xiang; Yang, Bing; Song, Chun-Qing; Wang, Le-Heng; Liu, Chao; Fu, Yan; Yuan, Zuo-Fei; Wang, Hai-Peng; He, Si-Min; Dong, Meng-Qiu.

J Proteome Res ; 9(5): 2713-24, 2010 May 07.

Article in English | MEDLINE | ID: mdl-20329752

ABSTRACT

De novo peptide sequencing has improved remarkably in the past decade as a result of better instruments and computational algorithms. However, de novo sequencing can correctly interpret only approximately 30% of high- and medium-quality spectra generated by collision-induced dissociation (CID), which is much less than database search. This is mainly due to incomplete fragmentation and overlap of different ion series in CID spectra. In this study, we show that higher-energy collisional dissociation (HCD) is of great help to de novo sequencing because it produces high mass accuracy tandem mass spectrometry (MS/MS) spectra without the low-mass cutoff associated with CID in ion trap instruments. Besides, abundant internal and immonium ions in the HCD spectra can help differentiate similar peptide sequences. Taking advantage of these characteristics, we developed an algorithm called pNovo for efficient de novo sequencing of peptides from HCD spectra. pNovo gave correct identifications to 80% or more of the HCD spectra identified by database search. The number of correct full-length peptides sequenced by pNovo is comparable with that obtained by database search. A distinct advantage of de novo sequencing is that deamidated peptides and peptides with amino acid mutations can be identified efficiently without extra cost in computation. In summary, implementation of the HCD characteristics makes pNovo an excellent tool for de novo peptide sequencing from HCD spectra.

Subjects

Algorithms , Peptide Fragments/chemistry , Protein Sequence Analysis/methods , Tandem Mass Spectrometry/methods , Amino Acid Sequence , Animals , Cattle , Chickens , Data Mining , Protein Databases , Escherichia coli Proteins , Molecular Sequence Data , Proteins/chemistry , Rabbits , Software , Glycine max

Speeding up tandem mass spectrometry based database searching by peptide and spectrum indexing.

Li, You; Chi, Hao; Wang, Le-Heng; Wang, Hai-Peng; Fu, Yan; Yuan, Zuo-Fei; Li, Su-Jun; Liu, Yan-Sheng; Sun, Rui-Xiang; Zeng, Rong; He, Si-Min.

Rapid Commun Mass Spectrom ; 24(6): 807-14, 2010 Mar.

Article in English | MEDLINE | ID: mdl-20187083

ABSTRACT

Database searching is the technique of choice for shotgun proteomics, and to date much research effort has been spent on improving its effectiveness. However, database searching faces a serious efficiency challenge, given the large numbers of mass spectra and the ever-faster growth of peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications. In this study, we conducted systematic research on speeding up database search engines for protein identification and illustrate the key points with the specific design of the pFind 2.1 search engine as a running example. First, by constructing peptide indexes, pFind achieves a two- to three-fold speedup over searching without peptide indexes. Second, by constructing indexes for observed precursor and fragment ions, pFind achieves a further two-fold speedup. As a result, pFind compares very favorably with predominant search engines such as Mascot, SEQUEST and X!Tandem.

Subjects

Data Mining/methods , Protein Databases , Peptide Fragments/chemistry , Proteins/chemistry , Tandem Mass Spectrometry/methods , Algorithms , Blood Proteins/chemistry , Computer Simulation , Database Management Systems , Fungal Proteins/chemistry , Humans , Proteomics/methods
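
The abstract above attributes part of pFind's speedup to peptide indexes. The sketch below shows one common form such an index can take, assuming a mass-sorted list of unique tryptic peptides queried by binary search. It is not pFind's actual index layout, which the abstract does not describe. The function names, the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and the 10 ppm default tolerance are all assumptions for illustration.

```python
import bisect

# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def build_index(proteins):
    """Build the index once: unique peptides sorted by mass."""
    unique = {pep for prot in proteins for pep in tryptic_peptides(prot)}
    return sorted((peptide_mass(pep), pep) for pep in unique)


def candidates(index, precursor_mass, tol_ppm=10.0):
    """Binary-search the index for peptides within the precursor mass tolerance."""
    delta = precursor_mass * tol_ppm * 1e-6
    lo = bisect.bisect_left(index, (precursor_mass - delta, ""))
    hi = bisect.bisect_right(index, (precursor_mass + delta, "\uffff"))
    return [pep for _, pep in index[lo:hi]]


# Toy database (hypothetical sequences).
index = build_index(["MKAGRPLK", "AGRGAK"])
print(candidates(index, peptide_mass("GAK")))
```

The digestion and mass calculation happen once at index-build time, so each spectrum costs only two binary searches instead of a pass over the whole database.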

A strategy for precise and large scale identification of core fucosylated glycoproteins.

Jia, Wei; Lu, Zhuang; Fu, Yan; Wang, Hai-Peng; Wang, Le-Heng; Chi, Hao; Yuan, Zuo-Fei; Zheng, Zhao-Bin; Song, Li-Na; Han, Huan-Huan; Liang, Yi-Min; Wang, Jing-Lan; Cai, Yun; Zhang, Yu-Kui; Deng, Yu-Lin; Ying, Wan-Tao; He, Si-Min; Qian, Xiao-Hong.

Mol Cell Proteomics ; 8(5): 913-23, 2009 May.

Article in English | MEDLINE | ID: mdl-19139490

ABSTRACT

Core fucosylation (CF) patterns of some glycoproteins are more sensitive and specific than evaluation of their total respective protein levels for diagnosis of many diseases, such as cancers. Global profiling and quantitative characterization of CF glycoproteins may reveal potent biomarkers for clinical applications. However, current techniques are unable to reveal CF glycoproteins precisely on a large scale. Here we developed a robust strategy that integrates molecular weight cutoff, neutral loss-dependent MS³, database-independent candidate spectrum filtering, and optimization to effectively identify CF glycoproteins. The rationale for spectrum treatment was innovatively based on computation of the mass distribution in spectra of CF glycopeptides. The efficacy of this strategy was demonstrated by applying it to plasma from healthy subjects and subjects with hepatocellular carcinoma. Over 100 CF glycoproteins and CF sites were identified, and over 10,000 mass spectra of CF glycopeptides were found. The scale of identification results indicates great progress for finding biomarkers with a particular and attractive prospect, and the candidate spectra will be a useful resource for the improvement of database searching methods for glycopeptides.

Subjects

Fucose/metabolism , Glycoproteins/analysis , Proteomics/methods , Acetylglucosamine/metabolism , Amino Acid Sequence , Biomedical Research , Glycopeptides/blood , Glycopeptides/chemistry , Glycosylation , Humans , Mass Spectrometry , Molecular Sequence Data , Ultrafiltration

pFind 2.0: a software package for peptide and protein identification via tandem mass spectrometry.

Wang, Le-Heng; Li, De-Quan; Fu, Yan; Wang, Hai-Peng; Zhang, Jing-Fen; Yuan, Zuo-Fei; Sun, Rui-Xiang; Zeng, Rong; He, Si-Min; Gao, Wen.

Rapid Commun Mass Spectrom ; 21(18): 2985-91, 2007.

Article in English | MEDLINE | ID: mdl-17702057

ABSTRACT

This paper describes the pFind 2.0 software package for peptide and protein identification via tandem mass spectrometry. Firstly, the most important feature of pFind 2.0 is that it offers a modularized and customized platform for third parties to test and compare their algorithms. Developers can create their own modules following the open application programming interface (API) standards and then add them into workflows in place of the default modules. In addition, to accommodate different requirements, the package provides four automated workflows adopting different algorithm modules, executing processes and result reports. Based on this design, pFind 2.0 provides an automated target-decoy database search strategy: the user can just specify a certain false positive rate (FPR) and start searching. The system will then return the protein identification results automatically filtered by such an estimated FPR. Secondly, pFind 2.0 is also of high accuracy and high speed. Many pragmatic preprocessing, peptide-scoring, validation, and protein inference algorithms have been incorporated. To speed up the searching process, a toolbox for indexing protein databases was developed for high-throughput applications, and all modules are implemented under a new architecture designed for large-scale parallel and distributed searching. An experiment on a public dataset shows that pFind 2.0 can identify more peptides than SEQUEST and Mascot at the 1% FPR. It is also demonstrated that this version of pFind 2.0 has better usability and higher speed than its previous versions. The software and more detailed supplementary information can both be accessed at http://pfind.ict.ac.cn/.

Subjects

Mass Spectrometry/methods , Peptide Mapping/methods , Peptides/chemistry , Proteins/chemistry , Protein Sequence Analysis/methods , Software , User-Computer Interface , Algorithms , Amino Acid Sequence , Computer Graphics , Molecular Sequence Data , Software Validation
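
The pFind 2.0 abstract above mentions an automated target-decoy strategy in which the user specifies an error rate and results are filtered to match. A minimal sketch of the generic target-decoy idea follows, assuming each peptide-spectrum match (PSM) carries a score and a decoy flag, and estimating the error rate as decoys divided by targets above a score cutoff. The function name, the input layout, and this particular estimator are assumptions; the abstract does not state which formula pFind uses.

```python
def filter_by_error_rate(psms, max_rate=0.01):
    """Keep the largest top-scoring set of PSMs whose decoy/target ratio is within max_rate.

    psms: iterable of (score, is_decoy) pairs; a higher score means a better match.
    Returns the scores of the accepted target PSMs, best first.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to accept
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated error rate among accepted targets: decoy hits / target hits.
        if targets > 0 and decoys / targets <= max_rate:
            cutoff = i
    return [score for score, is_decoy in ranked[:cutoff] if not is_decoy]


# Toy PSM list (hypothetical scores): True marks a hit to the decoy database.
psms = [(10, False), (9, False), (8, False), (7, True), (6, False), (5, True)]
print(filter_by_error_rate(psms, max_rate=0.25))
```

Scanning the whole ranked list and remembering the last position that satisfies the limit, rather than stopping at the first violation, accepts more identifications when an early decoy is followed by a run of targets.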
