Results 1 - 9 of 9
1.
J Proteome Res ; 10(7): 3060-75, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21599010

ABSTRACT

When analyzing proteins in complex samples using tandem mass spectrometry of peptides generated by proteolysis, the inference of proteins can be ambiguous, even with well-validated peptides. Unresolved questions include whether to show all possible proteins vs a minimal list, what to do when proteins are inferred ambiguously, and how to quantify peptides that bridge multiple proteins, each with distinguishing evidence. Here we describe IsoformResolver, a peptide-centric protein inference algorithm that clusters proteins in two ways, one based on peptides experimentally identified from MS/MS spectra, and the other based on peptides derived from an in silico digest of the protein database. MS/MS-derived protein groups report minimal list proteins in the context of all possible proteins, without redundantly listing peptides. In silico-derived protein groups pull together functionally related proteins, providing stable identifiers. The peptide-centric grouping strategy used by IsoformResolver allows proteins to be displayed together when they share peptides in common, providing a comprehensive yet concise way to organize protein profiles. It also summarizes information on spectral counts and is especially useful for comparing results from multiple LC-MS/MS experiments. Finally, we examine the relatedness of proteins within IsoformResolver groups and compare its performance to other protein inference software.
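The peptide-centric grouping idea in this abstract can be illustrated with a minimal sketch. It is not the IsoformResolver implementation: the function name, the input layout (a dictionary from peptide sequence to the protein accessions containing it), and the use of union-find are all assumptions made for illustration. Proteins land in the same group whenever they are linked by a chain of shared peptides.

```python
from collections import defaultdict

def group_proteins_by_shared_peptides(peptide_to_proteins):
    """Cluster proteins that are connected through shared peptides.

    peptide_to_proteins: dict mapping a peptide to the proteins that contain it
    (hypothetical input layout, chosen for this sketch).
    Returns a sorted list of sorted protein groups.
    """
    parent = {}

    def find(x):
        # Register unseen proteins, then walk to the root with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for proteins in peptide_to_proteins.values():
        proteins = list(proteins)
        for protein in proteins:
            find(protein)
        # Every protein sharing this peptide joins the first protein's group.
        for other in proteins[1:]:
            union(proteins[0], other)

    groups = defaultdict(set)
    for protein in parent:
        groups[find(protein)].add(protein)
    return sorted(sorted(group) for group in groups.values())


example = {
    "PEPTIDEA": ["P1", "P2"],
    "PEPTIDEB": ["P2", "P3"],
    "PEPTIDEC": ["P4"],
}
print(group_proteins_by_shared_peptides(example))
```

Running the same function on peptides from an in silico digest instead of identified peptides would give the second, experiment-independent kind of grouping the abstract mentions, under the same assumptions.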


Subject(s)
Data Mining/methods , Peptide Fragments/analysis , Protein Isoforms , Proteomics/methods , Algorithms , Chromatography, Liquid , Databases, Protein , Humans , Peptide Fragments/chemistry , Protein Isoforms/analysis , Protein Isoforms/chemistry , Software , Tandem Mass Spectrometry , Trypsin/metabolism
2.
J Proteome Res ; 9(8): 4152-60, 2010 Aug 06.
Article in English | MEDLINE | ID: mdl-20578722

ABSTRACT

A complicating factor for protein identification within complex mixtures by LC/MS/MS is the problem of "chimera" spectra, where two or more precursor ions with similar mass and retention time are co-sequenced by MS/MS. Chimera spectra show reduced scores due to unidentifiable fragment ions derived from contaminating parents. However, the extent of chimeras in LC/MS/MS data sets and their impact on protein identification workflows are incompletely understood. We report ChimeraCounter, a software program which detects chimeras in data sets collected on an Orbitrap/LTQ instrument. Evaluation of synthetic chimeras created from pairs of well-defined peptide MS/MS spectra reveals that chimeras reduce database search scores most significantly when contaminating fragment ion intensities exceed 20% of the targeted fragment ion intensities. In large-scale data sets, the identification rate for chimera MS/MS is 2-fold lower compared to nonchimera spectra. Importantly, this occurs in a manner which depends not on absolute precursor ion intensity, but on intensity relative to the median precursor intensity distribution. We further show that chimeras reduce the number of accepted peptide identifications by increasing false negatives while showing little increase in false positives. The results provide a framework for identifying chimeras and characterizing their contribution to the poorly understood false negative class of MS/MS.
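The 20% figure reported in this abstract can be turned into a simple check, sketched below. This is not ChimeraCounter: the function name and inputs are hypothetical, and the sketch assumes that fragment ions have already been assigned to either the targeted peptide or a contaminating precursor, a step the abstract does not describe.

```python
def contamination_ratio(target_fragment_intensities,
                        contaminant_fragment_intensities,
                        threshold=0.20):
    """Return (exceeds_threshold, ratio) for one MS/MS spectrum.

    ratio = summed contaminating fragment intensity / summed targeted
    fragment intensity. The default threshold of 0.20 mirrors the 20% level
    at which the abstract reports the largest score reduction.
    """
    target_total = sum(target_fragment_intensities)
    if target_total <= 0:
        raise ValueError("targeted fragment intensity must be positive")
    ratio = sum(contaminant_fragment_intensities) / target_total
    return ratio > threshold, ratio


flagged, ratio = contamination_ratio([60.0, 40.0], [20.0, 10.0])
print(flagged, round(ratio, 2))
```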


Subject(s)
Peptides/analysis , Proteomics/methods , Software , Tandem Mass Spectrometry/methods , Cell Line, Tumor , Chromatography, Liquid , Computational Biology , Humans , Reproducibility of Results , Research Design , Tandem Mass Spectrometry/standards
3.
Mol Cell ; 34(1): 115-31, 2009 Apr 10.
Article in English | MEDLINE | ID: mdl-19362540

ABSTRACT

Melanoma and other cancers harbor oncogenic mutations in the protein kinase B-Raf, which leads to constitutive activation and dysregulation of MAP kinase signaling. In order to elucidate molecular determinants responsible for B-Raf control of cancer phenotypes, we present a method for phosphoprotein profiling, using negative ionization mass spectrometry to detect phosphopeptides based on their fragment ion signature caused by release of PO₃⁻. The method provides an alternative strategy for phosphoproteomics, circumventing affinity enrichment of phosphopeptides and isotopic labeling of samples. Ninety phosphorylation events were regulated by oncogenic B-Raf signaling, based on their responses to treating melanoma cells with MKK1/2 inhibitor. Regulated phosphoproteins included known signaling effectors and cytoskeletal regulators. We investigated MINERVA/FAM129B, a target belonging to a protein family with unknown category and function, and established the importance of this protein and its MAP kinase-dependent phosphorylation in controlling melanoma cell invasion into three-dimensional collagen matrix.


Subject(s)
Melanoma/metabolism , Proteomics , Proto-Oncogene Proteins B-raf/metabolism , Cell Line, Tumor , Humans , MAP Kinase Signaling System , Mass Spectrometry , Mutation , Phosphoproteins/analysis , Phosphoproteins/chemistry , Phosphoproteins/genetics , Phosphoproteins/metabolism , Phosphoproteins/physiology , Phosphorylation , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins B-raf/physiology , Substrate Specificity
4.
Mol Cell Proteomics ; 8(4): 857-69, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19106086

ABSTRACT

Identifying peptides from mass spectrometric fragmentation data (MS/MS spectra) using search strategies that map protein sequences to spectra is computationally expensive. An alternative strategy uses direct spectrum-to-spectrum matching against a reference library of previously observed MS/MS that has the advantage of evaluating matches using fragment ion intensities and other ion types than the simple set normally used. However, this approach is limited by the small sizes of the available peptide MS/MS libraries and the inability to evaluate the rate of false assignments. In this study, we observed good performance of simulated spectra generated by the kinetic model implemented in MassAnalyzer (Zhang, Z. (2004) Prediction of low-energy collision-induced dissociation spectra of peptides. Anal. Chem. 76, 3908-3922; Zhang, Z. (2005) Prediction of low-energy collision-induced dissociation spectra of peptides with three or more charges. Anal. Chem. 77, 6364-6373) as a substitute for the reference libraries used by the spectrum-to-spectrum search programs X!Hunter and BiblioSpec and similar results in comparison with the spectrum-to-sequence program Mascot. We also demonstrate the use of simulated spectra for searching against decoy sequences to estimate false discovery rates. Although we found lower score discrimination with spectrum-to-spectrum searches than with Mascot, particularly for higher charge forms, comparable peptide assignments with low false discovery rate were achieved by examining consensus between X!Hunter and Mascot, filtering results by mass accuracy, and ignoring score thresholds. Protein identification results are comparable to those achieved when evaluating consensus between Sequest and Mascot. Run times with large scale data sets using X!Hunter with the simulated spectral library are 7 times faster than Mascot and 80 times faster than Sequest with the human International Protein Index (IPI) database. 
We conclude that simulated spectral libraries greatly expand the search space available for spectrum-to-spectrum searching while enabling principled analyses and that the approach can be used in consensus strategies for large scale studies while reducing search times.
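The decoy-based estimate of false discovery rate mentioned in this abstract can be sketched as follows. The abstract does not give the estimator used, so the simple ratio of decoy to target matches above a score threshold is an assumption, as are the function name and the (score, is_decoy) input layout.

```python
def estimate_fdr(matches, score_threshold):
    """Estimate FDR as decoy matches / target matches at or above a threshold.

    matches: list of (score, is_decoy) pairs, one per spectrum match
    (hypothetical layout for this sketch).
    """
    accepted = [is_decoy for score, is_decoy in matches if score >= score_threshold]
    decoys = sum(1 for is_decoy in accepted if is_decoy)
    targets = len(accepted) - decoys
    if targets == 0:
        return 0.0  # nothing accepted from the target database
    return decoys / targets


matches = [(50, False), (45, False), (40, True), (38, False), (35, False), (20, True)]
print(estimate_fdr(matches, score_threshold=30))
```

With these example scores, four target matches and one decoy match pass the threshold, so the estimate is 1/4.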


Subject(s)
Computer Simulation , Mass Spectrometry , Peptide Library , Proteins/analysis , Amino Acid Sequence , Cell Line, Tumor , Databases, Protein , Humans , Peptides/chemistry , Proteins/chemistry , ROC Curve , Reference Standards , Software
5.
Mol Cell Proteomics ; 6(1): 1-17, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17018520

ABSTRACT

A major limitation in identifying peptides from complex mixtures by shotgun proteomics is the ability of search programs to accurately assign peptide sequences using mass spectrometric fragmentation spectra (MS/MS spectra). Manual analysis is used to assess borderline identifications; however, it is error-prone and time-consuming, and criteria for acceptance or rejection are not well defined. Here we report a Manual Analysis Emulator (MAE) program that evaluates results from search programs by implementing two commonly used criteria: 1) consistency of fragment ion intensities with predicted gas phase chemistry and 2) whether a high proportion of the ion intensity (proportion of ion current (PIC)) in the MS/MS spectra can be derived from the peptide sequence. To evaluate chemical plausibility, MAE utilizes similarity (Sim) scoring against theoretical spectra simulated by MassAnalyzer software (Zhang, Z. (2004) Prediction of low-energy collision-induced dissociation spectra of peptides. Anal. Chem. 76, 3908-3922) using known gas phase chemical mechanisms. The results show that Sim scores provide significantly greater discrimination between correct and incorrect search results than achieved by Sequest XCorr scoring or Mascot Mowse scoring, allowing reliable automated validation of borderline cases. To evaluate PIC, MAE simplifies the DTA text files summarizing the MS/MS spectra and applies heuristic rules to classify the fragment ions. MAE output also provides data mining functions, which are illustrated by using PIC to identify spectral chimeras, where two or more peptide ions were sequenced together, as well as cases where fragmentation chemistry is not well predicted.
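The proportion of ion current (PIC) criterion described in this abstract can be sketched as the fraction of total spectrum intensity that falls on fragment m/z values predicted from the candidate peptide. This is not the MAE program: the function name, the peak-list layout, and the 0.5 m/z matching tolerance are assumptions, and the heuristic rules MAE uses to classify fragment ions are not reproduced.

```python
def proportion_of_ion_current(observed_peaks, predicted_mz, tolerance=0.5):
    """Fraction of total intensity explained by predicted fragment m/z values.

    observed_peaks: list of (mz, intensity) pairs from one MS/MS spectrum.
    predicted_mz:   fragment m/z values predicted for the candidate peptide.
    tolerance:      matching window in m/z units (assumed value).
    """
    total = sum(intensity for _, intensity in observed_peaks)
    if total == 0:
        return 0.0
    explained = sum(
        intensity
        for mz, intensity in observed_peaks
        if any(abs(mz - predicted) <= tolerance for predicted in predicted_mz)
    )
    return explained / total


peaks = [(100.0, 50.0), (200.2, 30.0), (300.0, 20.0)]
print(proportion_of_ion_current(peaks, predicted_mz=[100.1, 200.0]))
```

A low value for an otherwise good match suggests that much of the signal comes from something other than the assigned peptide, which is how the abstract describes using PIC to find spectral chimeras.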


Subject(s)
Mass Spectrometry/methods , Peptides/analysis , Peptides/chemistry , Amino Acid Sequence , Databases, Protein , Humans , K562 Cells , Molecular Sequence Data , Neoplasm Proteins/chemistry , Protein Array Analysis , Proteomics , ROC Curve , Reproducibility of Results , Software
6.
J Proteome Res ; 5(3): 709-19, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16512687

ABSTRACT

An important strategy for "shotgun proteomics" profiling involves solution proteolysis of proteins, followed by peptide separation using multidimensional liquid chromatography and automated sequencing by mass spectrometry (LC-MS/MS). Several protocols for extracting and handling membrane proteins for shotgun proteomics experiments have been reported, but few direct comparisons of these protocols are available. We compare four methods for preparing membrane proteins from human cells, using acid labile surfactants (ALS), urea, and mixed organic-aqueous solvents. These methods were compared with respect to their efficiency of protein solubilization and proteolysis, peptide and protein recovery, membrane protein enrichment, and peptide coverage of transmembrane proteins. Overall, approximately 50-60% of proteins recovered were membrane-associated, identified from Gene Ontology annotations and transmembrane prediction software. Samples extracted with ALS, extracted with urea followed by dilution, or extracted with urea followed by desalting yielded comparable peptide recoveries and sequence coverage of transmembrane proteins. In contrast, suboptimal proteolysis was observed with organic solvent. Urea extraction followed by desalting may be a particularly useful approach, as it is less costly than ALS and yields satisfactory protein denaturation and proteolysis under conditions that minimize reactivity with urea-derived cyanate. Spectral counting was used to compare datasets of proteins from membrane samples with those of soluble proteins from K562 cells, and to estimate fold differences in protein abundances. Proteins most highly abundant in the membrane samples showed enrichment of integral membrane protein identifications, consistent with their isolation by differential centrifugation.


Subject(s)
Cell Extracts/analysis , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/metabolism , Membrane Proteins/analysis , Neoplasm Proteins/chemistry , Chromatography, Liquid , Electrophoresis, Polyacrylamide Gel , Humans , K562 Cells , Neoplasm Proteins/analysis , Tandem Mass Spectrometry
7.
Anal Chem ; 78(4): 1071-84, 2006 Feb 15.
Article in English | MEDLINE | ID: mdl-16478097

ABSTRACT

Correct identification of a peptide sequence from MS/MS data is still a challenging research problem, particularly in proteomic analyses of higher eukaryotes where protein databases are large. The scoring methods of search programs often generate cases where incorrect peptide sequences score higher than correct peptide sequences (referred to as distraction). Because smaller databases yield less distraction and better discrimination between correct and incorrect assignments, we developed a method for editing a peptide-centric database (PC-DB) to remove unlikely sequences and strategies for enabling search programs to utilize this peptide database. Rules for unlikely missed cleavage and nontryptic proteolysis products were identified by data mining 11 849 high-confidence peptide assignments. We also evaluated ion exchange chromatographic behavior as an editing criterion to generate subset databases. When used to search a well-annotated test data set of MS/MS spectra, we found no loss of critical information using PC-DBs, validating the methods for generating and searching against the databases. On the other hand, improved confidence in peptide assignments was achieved for tryptic peptides, measured by changes in DeltaCN and RSP. Decreased distraction was also achieved, consistent with the 3-9-fold decrease in database size. Data mining identified a major class of common nonspecific proteolytic products corresponding to leucine aminopeptidase (LAP) cleavages. Large improvements in identifying LAP products were achieved using the PC-DB approach when compared with conventional searches against protein databases. These results demonstrate that peptide properties can be used to reduce database size, yielding improved accuracy and information capture due to reduced distraction, but with little loss of information compared to conventional protein database searches.
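The starting point for a peptide-centric database like the one in this abstract is an in silico digest of each protein sequence. The sketch below enumerates tryptic peptides with a limited number of missed cleavages. It is not the authors' PC-DB procedure: the function name and parameters are hypothetical, the cleavage rule (after K or R, but not before P) is a common convention assumed here, and the editing rules that remove unlikely products are not reproduced because the abstract does not state them.

```python
def tryptic_digest(sequence, max_missed_cleavages=1, min_length=1):
    """Enumerate tryptic peptides of a protein sequence.

    Cleaves after K or R unless the next residue is P (assumed rule).
    Returns the set of peptides with up to max_missed_cleavages internal
    cleavage sites left uncut.
    """
    # Boundaries between fully cleaved fragments, including both termini.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = set()
    for start in range(len(sites) - 1):
        for missed in range(max_missed_cleavages + 1):
            end = start + 1 + missed
            if end >= len(sites):
                break
            peptide = sequence[sites[start]:sites[end]]
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return peptides


print(sorted(tryptic_digest("AKRPGKC", max_missed_cleavages=1)))
```

A filter applied to this set, for example one that drops missed-cleavage products judged unlikely, would then yield a smaller database of the kind the abstract describes.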


Subject(s)
Databases, Protein , Information Storage and Retrieval , Peptide Hydrolases/metabolism , Peptides/chemistry , Proteomics , Amino Acid Sequence , Humans , Hydrolysis , Mass Spectrometry , Molecular Sequence Data , Sensitivity and Specificity
8.
Mol Cell Proteomics ; 4(10): 1487-502, 2005 Oct.
Article in English | MEDLINE | ID: mdl-15979981

ABSTRACT

Measurements of mass spectral peak intensities and spectral counts are promising methods for quantifying protein abundance changes in shotgun proteomic analyses. We describe Serac, software developed to evaluate the ability of each method to quantify relative changes in protein abundance. Dynamic range and linearity using a three-dimensional ion trap were tested using standard proteins spiked into a complex sample. Linearity and good agreement between observed versus expected protein ratios were obtained after normalization and background subtraction of peak area intensity measurements and correction of spectral counts to eliminate discontinuity in ratio estimates. Peak intensity values useful for protein quantitation ranged from 10^7 to 10^11 counts with no obvious saturation effect, and proteins in replicate samples showed variations of less than 2-fold within the 95% range (±2σ) when ≥3 peptides/protein were shared between samples. Protein ratios were determined with high confidence from spectral counts when maximum spectral counts were ≥4 spectra/protein, and replicates showed equivalent measurements well within 95% confidence limits. In further tests, complex samples were separated by gel exclusion chromatography, quantifying changes in protein abundance between different fractions. Linear behavior of peak area intensity measurements was obtained for peptides from proteins in different fractions. Protein ratios determined by spectral counting agreed well with those determined from peak area intensity measurements, and both agreed with independent measurements based on gel staining intensities. Overall, spectral counting proved to be a more sensitive method for detecting proteins that undergo changes in abundance, whereas peak area intensity measurements yielded more accurate estimates of protein ratios.
Finally these methods were used to analyze differential changes in protein expression in human erythroleukemia K562 cells stimulated under conditions that promote cell differentiation by mitogen-activated protein kinase pathway activation. Protein changes identified with p<0.1 showed good correlations with parallel measurements of changes in mRNA expression.
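The abstract mentions correcting spectral counts to avoid discontinuities in ratio estimates, which arise when a protein has zero counts in one sample. One simple way to do this is to add a pseudo-count before taking the ratio, as sketched below. This is not the Serac formula, which the abstract does not give: the function name, the pseudo-count value of 1.0, and the normalization by total spectral counts are all assumptions.

```python
import math

def log2_spectral_count_ratio(count_1, count_2, total_1, total_2, pseudo_count=1.0):
    """log2 ratio of sample 2 over sample 1 for one protein.

    count_1, count_2: spectral counts for the protein in each sample.
    total_1, total_2: total spectral counts in each sample, used to
                      normalize for differences in sampling depth.
    pseudo_count:     added to both counts so a zero count does not make
                      the ratio undefined (assumed value).
    """
    count_term = math.log2((count_2 + pseudo_count) / (count_1 + pseudo_count))
    depth_term = math.log2(total_1 / total_2)
    return count_term + depth_term


# A protein absent from sample 1 and seen 7 times in sample 2.
print(log2_spectral_count_ratio(0, 7, total_1=1000, total_2=1000))
```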


Subject(s)
Proteins/analysis , Proteomics/methods , Bias , Chromatography, Gel , Humans , K562 Cells , Mass Spectrometry , Peptides/analysis , RNA, Messenger/genetics , Sensitivity and Specificity , Software , Tetradecanoylphorbol Acetate/pharmacology , User-Computer Interface
9.
Anal Chem ; 76(13): 3556-68, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15228325

ABSTRACT

Identifying proteins in cell extracts by shotgun proteomics involves digesting the proteins, sequencing the resulting peptides by data-dependent mass spectrometry (MS/MS), and searching protein databases to identify the proteins from which the peptides are derived. Manual analysis and direct spectral comparison reveal that scores from two commonly used search programs (Sequest and Mascot) validate less than half of potentially identifiable MS/MS spectra (class positive) from shotgun analyses of the human erythroleukemia K562 cell line. Here we demonstrate increased sensitivity and accuracy using a focused search strategy along with a peptide sequence validation script that does not rely exclusively on XCorr or Mowse scores generated by Sequest or Mascot, but uses consensus between the search programs, along with chemical properties and scores describing the nature of the fragmentation spectrum (ion score and RSP). The approach yielded 4.2% false positive and 8% false negative frequencies in peptide assignments. The protein profile is then assembled from peptide assignments using a novel peptide-centric protein nomenclature that more accurately reports protein variants that contain identical peptide sequences. An Isoform Resolver algorithm ensures that the protein count is not inflated by variants in the protein database, eliminating approximately 25% of redundant proteins. Analysis of soluble proteins from human K562 cells identified 5130 unique proteins, with approximately 100 false positive protein assignments.


Subject(s)
Proteins/chemistry , Proteomics/methods , Cell Line, Tumor , Humans , K562 Cells , Mass Spectrometry/methods , Peptides/chemistry , Reproducibility of Results , Sensitivity and Specificity