Results 1 - 20 of 27
1.
Structure ; 30(9): 1269-1284.e6, 2022 09 01.
Article in English | MEDLINE | ID: mdl-35716664

ABSTRACT

RING-between-RING (RBR) E3 ligases mediate ubiquitin transfer through an obligate E3-ubiquitin thioester intermediate prior to substrate ubiquitination. Although RBRs share a conserved catalytic module, substrate recruitment mechanisms remain enigmatic, and the relevant domains have yet to be identified for any member of the class. Here we characterize the interaction between the auto-inhibited RBR, HHARI (AriH1), and its target protein, 4EHP, using a combination of XL-MS, HDX-MS, NMR, and biochemical studies. The results show that (1) a di-aromatic surface on the catalytic HHARI Rcat domain forms a binding platform for substrates and (2) a phosphomimetic mutation on the auto-inhibitory Ariadne domain of HHARI promotes release and reorientation of Rcat for transthiolation and substrate modification. The findings identify a direct binding interaction between a RING-between-RING ligase and its substrate and suggest a general model for RBR substrate recognition.


Subjects
Cullin Proteins, Ubiquitin, Catalytic Domain, Cullin Proteins/metabolism, Ubiquitin/metabolism, Ubiquitin-Protein Ligases/chemistry, Ubiquitination
2.
Cell ; 177(4): 1035-1049.e19, 2019 05 02.
Article in English | MEDLINE | ID: mdl-31031003

ABSTRACT

We performed the first proteogenomic study on a prospectively collected colon cancer cohort. Comparative proteomic and phosphoproteomic analysis of paired tumor and normal adjacent tissues produced a catalog of colon cancer-associated proteins and phosphosites, including known and putative new biomarkers, drug targets, and cancer/testis antigens. Proteogenomic integration not only prioritized genomically inferred targets, such as copy-number drivers and mutation-derived neoantigens, but also yielded novel findings. Phosphoproteomics data associated Rb phosphorylation with increased proliferation and decreased apoptosis in colon cancer, which explains why this classical tumor suppressor is amplified in colon tumors and suggests a rationale for targeting Rb phosphorylation in colon cancer. Proteomics identified an association between decreased CD8 T cell infiltration and increased glycolysis in microsatellite instability-high (MSI-H) tumors, suggesting glycolysis as a potential target to overcome the resistance of MSI-H tumors to immune checkpoint blockade. Proteogenomics presents new avenues for biological discoveries and therapeutic development.


Subjects
Colonic Neoplasms/genetics, Colonic Neoplasms/therapy, Proteogenomics/methods, Apoptosis/genetics, Tumor Biomarkers/genetics, Tumor Biomarkers/metabolism, CD8-Positive T-Lymphocytes, Cell Proliferation/genetics, Colonic Neoplasms/metabolism, Genomics/methods, Glycolysis, Humans, Microsatellite Instability, Mutation, Phosphorylation, Prospective Studies, Proteomics/methods, Retinoblastoma Protein/genetics, Retinoblastoma Protein/metabolism
3.
Mol Cell Proteomics ; 17(3): 422-430, 2018 03.
Article in English | MEDLINE | ID: mdl-29222161

ABSTRACT

Alternative splicing dramatically increases transcriptome complexity but its contribution to proteome diversity remains controversial. Exon-exon junction spanning peptides provide direct evidence for the translation of specific splice isoforms and are critical for delineating protein isoform complexity. Here we found that junction-spanning peptides are underrepresented in publicly available mass spectrometry-based shotgun proteomics data sets. Further analysis showed that evolutionarily conserved preferential nucleotide usage at exon boundaries increases the occurrence of lysine- and arginine-coding triplets at the end of exons. Because both lysine and arginine residues are cleavage sites of trypsin, the nearly exclusive use of trypsin as the protein digestion enzyme in shotgun proteomic analyses hinders the detection of junction-spanning peptides. To study the impact of enzyme selection on splice junction detectability, we performed in-silico digestion of the human proteome using six proteases. The six enzymes created a total of 161,125 detectable junctions, and only 1,029 were common across all enzyme digestions. Chymotrypsin digestion provided the largest number of detectable junctions. Our experimental results further showed that the combination of a chymotrypsin-based human proteome analysis with a trypsin-based analysis increased detection of junction-spanning peptides by 37% over the trypsin-only analysis and identified over a thousand junctions that were undetectable in fully tryptic digests. Our study demonstrates that detection of proteome diversity resulting from alternative splicing is limited by trypsin cleavage specificity, and that complementary digestion schemes will be essential to comprehensively analyze the translation of alternative splicing isoforms.
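
The digestion argument in this abstract can be made concrete with a small sketch. The Python below is an illustration only, not the authors' pipeline: the cleavage rules are simplified (trypsin after K/R, chymotrypsin after F/W/Y/L, no proline rule, no missed cleavages), and the toy protein, the junction position, and the length and flank thresholds are all hypothetical.

```python
# Simplified cleavage rules (assumption): each enzyme cuts C-terminal to
# the listed residues; proline blocking and missed cleavages are ignored.
CLEAVAGE_RESIDUES = {
    "trypsin": "KR",
    "chymotrypsin": "FWYL",
}


def digest(sequence, enzyme):
    """Return half-open (start, end) intervals of fully cleaved peptides."""
    residues = CLEAVAGE_RESIDUES[enzyme]
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            peptides.append((start, i + 1))
            start = i + 1
    if start < len(sequence):
        peptides.append((start, len(sequence)))
    return peptides


def junction_detectable(sequence, junction, enzyme,
                        min_len=7, max_len=30, min_flank=2):
    """The junction lies between residues junction-1 and junction (0-based).

    It counts as detectable if a peptide of acceptable length covers at
    least min_flank residues on each side (thresholds are hypothetical).
    """
    for start, end in digest(sequence, enzyme):
        length = end - start
        if (min_len <= length <= max_len
                and start <= junction - min_flank
                and end >= junction + min_flank):
            return True
    return False


# Toy protein: the first "exon" ends in lysine, so trypsin cuts exactly
# at the exon-exon boundary.
EXON_1 = "MAFGSPEDVTK"
EXON_2 = "GSDAEVTWAPR"
PROTEIN = EXON_1 + EXON_2
JUNCTION = len(EXON_1)

print(junction_detectable(PROTEIN, JUNCTION, "trypsin"))       # False
print(junction_detectable(PROTEIN, JUNCTION, "chymotrypsin"))  # True
```

In this toy case the lysine at the end of the first exon makes trypsin cut at the boundary itself, while the chymotrypsin peptide runs across it, which is the effect the abstract describes at proteome scale.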


Subjects
Alternative Splicing, Peptide Hydrolases/chemistry, Proteome, Tumor Cell Line, Exons, Humans, Neoplasm Proteins/chemistry, Neoplasms/metabolism, Peptides/chemistry
4.
Cancer Res ; 77(21): e43-e46, 2017 11 01.
Article in English | MEDLINE | ID: mdl-29092937

ABSTRACT

Proteogenomics has emerged as a valuable approach in cancer research, which integrates genomic and transcriptomic data with mass spectrometry-based proteomics data to directly identify expressed, variant protein sequences that may have functional roles in cancer. This approach is computationally intensive, requiring integration of disparate software tools into sophisticated workflows, challenging its adoption by nonexpert, bench scientists. To address this need, we have developed an extensible, Galaxy-based resource aimed at providing more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: (i) generation of customized, annotated protein sequence databases from RNA-Seq data; and (ii) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub. Cancer Res; 77(21); e43-46. ©2017 AACR.


Subjects
Computational Biology/methods, Genomics/methods, Neoplasms/genetics, Software, Human Genome, Humans, Proteomics/methods, Tandem Mass Spectrometry, Transcriptome/genetics
5.
Gastroenterology ; 153(4): 1082-1095, 2017 10.
Article in English | MEDLINE | ID: mdl-28625833

ABSTRACT

BACKGROUND AND AIMS: Proteomics holds promise for individualizing cancer treatment. We analyzed to what extent the proteomic landscape of human colorectal cancer (CRC) is maintained in established CRC cell lines and the utility of proteomics for predicting therapeutic responses. METHODS: Proteomic and transcriptomic analyses were performed on 44 CRC cell lines, compared against primary CRCs (n=95) and normal tissues (n=60), and integrated with genomic and drug sensitivity data. RESULTS: Cell lines mirrored the proteomic aberrations of primary tumors, in particular for intrinsic programs. Tumor relationships of protein expression with DNA copy number aberrations and signatures of post-transcriptional regulation were recapitulated in cell lines. The 5 proteomic subtypes previously identified in tumors were represented among cell lines. Nonetheless, systematic differences between cell line and tumor proteomes were apparent, attributable to stroma, extrinsic signaling, and growth conditions. Contribution of tumor stroma obscured signatures of DNA mismatch repair identified in cell lines with a hypermutation phenotype. Global proteomic data showed improved utility for predicting both known drug-target relationships and overall drug sensitivity as compared with genomic or transcriptomic measurements. Inhibition of targetable proteins associated with drug responses further identified corresponding synergistic or antagonistic drug combinations. Our data provide evidence for CRC proteomic subtype-specific drug responses. CONCLUSIONS: Proteomes of established CRC cell lines are representative of primary tumors. Proteomic data tend to exhibit improved prediction of drug sensitivity as compared with genomic and transcriptomic profiles. Our integrative proteogenomic analysis highlights the potential of proteome profiling to inform personalized cancer medicine.


Subjects
Antineoplastic Agents/pharmacology, Tumor Biomarkers/metabolism, Colorectal Neoplasms/drug therapy, Colorectal Neoplasms/metabolism, Neoplasm Proteins/metabolism, Precision Medicine, Proteome, Tumor Biomarkers/genetics, Tumor Cell Line, Liquid Chromatography, Colorectal Neoplasms/genetics, Colorectal Neoplasms/pathology, Protein Databases, Drug Dose-Response Relationship, Antitumor Drug Screening Assays, Gene Expression Profiling, Neoplastic Gene Expression Regulation, Humans, Mutation, Neoplasm Proteins/genetics, Patient Selection, Single Nucleotide Polymorphism, Proteomics/methods, Signal Transduction, Stromal Cells/metabolism, Tandem Mass Spectrometry, Transcriptome, Tumor Microenvironment
6.
Anal Chem ; 88(11): 5733-41, 2016 06 07.
Article in English | MEDLINE | ID: mdl-27186799

ABSTRACT

Lipid identification from data produced with high-throughput technologies is essential to the elucidation of the roles played by lipids in cellular function and disease. Software tools for identifying lipids from tandem mass (MS/MS) spectra have been developed, but they are often costly or lack the sophistication of their proteomics counterparts. We have developed Greazy, an open source tool for the automated identification of phospholipids from MS/MS spectra, that utilizes methods similar to those developed for proteomics. From user-supplied parameters, Greazy builds a phospholipid search space and associated theoretical MS/MS spectra. Experimental spectra are scored against search space lipids with similar precursor masses using a peak score based on the hypergeometric distribution and an intensity score utilizing the percentage of total ion intensity residing in matching peaks. The LipidLama component filters the results via mixture modeling and density estimation. We assess Greazy's performance against the NIST 2014 metabolomics library, observing high accuracy in a search of multiple lipid classes. We compare Greazy/LipidLama against the commercial lipid identification software LipidSearch and show that the two platforms differ considerably in the sets of identified spectra while showing good agreement on those spectra identified by both. Lastly, we demonstrate the utility of Greazy/LipidLama with different instruments. We searched data from replicates of alveolar type 2 epithelial cells obtained with an Orbitrap and from human serum replicates generated on a quadrupole-time-of-flight (Q-TOF). These findings substantiate the application of proteomics derived methods to the identification of lipids. The software is available from the ProteoWizard repository: http://tiny.cc/bumbershoot-vc12-bin64 .
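
The two scores named in this abstract can be sketched in a few lines. This is a minimal illustration, assuming a binned m/z range and a fixed matching tolerance; it is not Greazy's actual implementation, and the function names, tolerance, bin counts, and toy peak lists are hypothetical.

```python
from math import comb, log10


def hypergeom_tail(total_bins, theoretical_bins, observed_peaks, matches):
    """P(X >= matches) for a hypergeometric draw.

    observed_peaks bins are drawn without replacement from total_bins,
    of which theoretical_bins hold a predicted fragment.
    """
    upper = min(theoretical_bins, observed_peaks)
    tail = sum(
        comb(theoretical_bins, k)
        * comb(total_bins - theoretical_bins, observed_peaks - k)
        for k in range(matches, upper + 1)
    )
    return tail / comb(total_bins, observed_peaks)


def match_peaks(experimental, theoretical_mz, tol=0.5):
    """Keep experimental (mz, intensity) peaks within tol of a predicted m/z."""
    return [
        (mz, inten)
        for mz, inten in experimental
        if any(abs(mz - t) <= tol for t in theoretical_mz)
    ]


def intensity_fraction(experimental, matched):
    """Fraction of total ion intensity that resides in matched peaks."""
    total = sum(inten for _, inten in experimental)
    return sum(inten for _, inten in matched) / total if total else 0.0


# Toy spectrum and toy predicted fragments (values are illustrative only).
experimental = [(184.1, 100.0), (300.0, 10.0), (478.3, 40.0), (496.3, 50.0)]
theoretical = [184.07, 496.34, 522.36]

matched = match_peaks(experimental, theoretical)
p_value = hypergeom_tail(total_bins=1000,
                         theoretical_bins=len(theoretical),
                         observed_peaks=len(experimental),
                         matches=len(matched))
peak_score = -log10(p_value)
fraction = intensity_fraction(experimental, matched)
print(len(matched), fraction)  # 2 0.75
```

A smaller tail probability means the observed number of matches is less likely to arise by chance, so the negative log is used here as the peak score; how Greazy combines it with the intensity score is not specified in the abstract.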


Subjects
Automation, Phospholipids/analysis, Software, Algorithms, Animals, Protein Databases, Epithelial Cells/chemistry, Humans, Mice, Inbred C57BL Mice, Tandem Mass Spectrometry
7.
Mol Cell Proteomics ; 15(3): 1164-75, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26657539

ABSTRACT

To facilitate genome-based representation and analysis of proteomics data, we developed a new bioinformatics framework, proBAMsuite, in which a central component is the protein BAM (proBAM) file format for organizing peptide spectrum matches (PSMs) within the context of the genome. proBAMsuite also includes two R packages, proBAMr and proBAMtools, for generating and analyzing proBAM files, respectively. Applying proBAMsuite to three recently published proteomics datasets, we demonstrated its utility in facilitating efficient genome-based sharing, interpretation, and integration of proteomics data. First, the interpretation of proteomics data is significantly enhanced with the rich genomic annotation information. Second, PSMs can be easily reannotated using user-specified gene annotation schemes and assembled into both protein and gene identifications. Third, using the genome as a common reference, proBAMsuite facilitates seamless proteomics and proteogenomics data integration. Finally, proBAM files can be readily visualized in genome browsers and thus bring proteomics data analysis to a general audience beyond the proteomics community. Results from this study establish proBAMsuite as a useful bioinformatics framework for proteomics and proteogenomics research.


Subjects
Bcl-2-Like Protein 11/metabolism, Computational Biology/methods, Molecular Sequence Annotation, Proteomics/methods, Protein Databases, Human Genome, Humans, Peptides/chemistry, Peptides/genetics, DNA Sequence Analysis/methods, Web Browser
8.
J Proteome Res ; 15(3): 691-706, 2016 Mar 04.
Article in English | MEDLINE | ID: mdl-26653538

ABSTRACT

The NCI Clinical Proteomic Tumor Analysis Consortium (CPTAC) employed a pair of reference xenograft proteomes for initial platform validation and ongoing quality control of its data collection for The Cancer Genome Atlas (TCGA) tumors. These two xenografts, representing basal and luminal-B human breast cancer, were fractionated and analyzed on six mass spectrometers in a total of 46 replicates divided between iTRAQ and label-free technologies, spanning a total of 1095 LC-MS/MS experiments. These data represent a unique opportunity to evaluate the stability of proteomic differentiation by mass spectrometry over many months of time for individual instruments or across instruments running dissimilar workflows. We evaluated iTRAQ reporter ions, label-free spectral counts, and label-free extracted ion chromatograms as strategies for data interpretation (source code is available from http://homepages.uc.edu/~wang2x7/Research.htm ). From these assessments, we found that differential genes from a single replicate were confirmed by other replicates on the same instrument from 61 to 93% of the time. When comparing across different instruments and quantitative technologies, using multiple replicates, differential genes were reproduced by other data sets from 67 to 99% of the time. Projecting gene differences to biological pathways and networks increased the degree of similarity. These overlaps send an encouraging message about the maturity of technologies for proteomic differentiation.


Subjects
Heterografts/chemistry, Proteomics/methods, Proteomics/standards, Breast Neoplasms/chemistry, Breast Neoplasms/metabolism, Liquid Chromatography, Statistical Data Interpretation, Female, Gene Expression Profiling/methods, Humans, Metabolic Networks and Pathways, Observer Variation, Proteome, Proteomics/instrumentation, Quality Control, Reproducibility of Results, Tandem Mass Spectrometry/standards
9.
PLoS One ; 10(10): e0140263, 2015.
Article in English | MEDLINE | ID: mdl-26466103

ABSTRACT

BACKGROUND: More than half of intravenous drug users (IDUs) in China suffer from the Hepatitis C virus (HCV). The virus is also more prevalent in non-injection drug users (NIDUs) than in the general population. However, not much is known about HCV subtype distribution in these populations. METHODS: Our research team conducted a cross-sectional study in four provinces in China. We sampled 825 IDUs and 244 NIDUs (1162 total), genotyped each DU's virus, and performed a phylogenetic analysis to differentiate HCV subtypes. RESULTS: Nucleic acid testing (NAT) determined that 82% (952/1162) of samples were HCV positive; we subtyped 90% (859/952) of these. We found multiple HCV subtypes: 3b (249, 29.0%), 3a (225, 26.2%), 6a (156, 18.2%), 1b (137, 15.9%), 6n (50, 5.9%), 1a (27, 3.1%), and 2a (15, 1.7%). An analysis of subtype distributions adjusted for province found statistically significant differences between HCV subtypes in IDUs and NIDUs. DISCUSSION: HCV subtypes 3b, 3a, 6a, and 1b were the most common in our study, together accounting for 89% of infections. The subtype distribution differences we found between IDUs and NIDUs suggested that sharing syringes was not the most likely pathway for HCV transmission in NIDUs. However, further studies are needed to elucidate how NIDUs were infected.


Subjects
Drug Users, Genotype, Hepacivirus/genetics, Hepatitis C/epidemiology, Hepatitis C/virology, Intravenous Substance Abuse/epidemiology, Adult, China/epidemiology, Cross-Sectional Studies, Female, Hepatitis C/transmission, Humans, Male, Phylogeny, Viral RNA, Viral Nonstructural Proteins/genetics
10.
Mol Cell Proteomics ; 14(12): 3299-309, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26435129

ABSTRACT

Questions concerning longitudinal data quality and reproducibility of proteomic laboratories spurred the Protein Research Group of the Association of Biomolecular Resource Facilities (ABRF-PRG) to design a study to systematically assess the reproducibility of proteomic laboratories over an extended period of time. Developed as an open study, initially 64 participants were recruited from the broader mass spectrometry community to analyze provided aliquots of a six bovine protein tryptic digest mixture every month for a period of nine months. Data were uploaded to a central repository, and the operators answered an accompanying survey. Ultimately, 45 laboratories submitted a minimum of eight LC-MSMS raw data files collected in data-dependent acquisition (DDA) mode. No standard operating procedures were enforced; rather the participants were encouraged to analyze the samples according to usual practices in the laboratory. Unlike previous studies, this investigation was not designed to compare laboratories or instrument configuration, but rather to assess the temporal intralaboratory reproducibility. The outcome of the study was reassuring with 80% of the participating laboratories performing analyses at a medium to high level of reproducibility and quality over the 9-month period. For the groups that had one or more outlying experiments, the major contributing factor that correlated to the survey data was the performance of preventative maintenance prior to the LC-MSMS analyses. Thus, the Protein Research Group of the Association of Biomolecular Resource Facilities recommends that laboratories closely scrutinize the quality control data following such events. Additionally, improved quality control recording is imperative. This longitudinal study provides evidence that mass spectrometry-based proteomics is reproducible. When quality control measures are strictly adhered to, such reproducibility is comparable among many disparate groups. 
Data from the study are available via ProteomeXchange under the accession code PXD002114.


Subjects
Liquid Chromatography/methods, Peptides/isolation & purification, Proteins/metabolism, Proteomics/methods, Tandem Mass Spectrometry/methods, Animals, Cattle, Humans, Laboratories, Longitudinal Studies, Proteins/analysis, Quality Control, Reproducibility of Results, Surveys and Questionnaires
11.
Bioinformatics ; 31(23): 3838-40, 2015 Dec 01.
Article in English | MEDLINE | ID: mdl-26243018

ABSTRACT

MOTIVATION: Systematic bias in mass measurement adversely affects data quality and negates the advantages of high precision instruments. RESULTS: We introduce the mzRefinery tool for calibration of mass spectrometry data files. Using confident peptide spectrum matches, three different calibration methods are explored and the optimal transform function is chosen. After calibration, systematic bias is removed and the mass measurement errors are centered at 0 ppm. Because it is part of the ProteoWizard package, mzRefinery can read and write a wide variety of file formats. AVAILABILITY AND IMPLEMENTATION: The mzRefinery tool is part of msConvert, available with the ProteoWizard open source package at http://proteowizard.sourceforge.net/ CONTACT: samuel.payne@pnnl.gov. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
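
The idea of centering mass errors at 0 ppm can be illustrated with the simplest possible correction, a single global shift. The abstract does not say which three calibration methods mzRefinery explores, so the sketch below is a stand-in chosen for illustration: it estimates the median ppm error from confident matches and divides it out. All function names and the toy m/z values are hypothetical.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass measurement error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def fit_global_shift(pairs):
    """Median ppm error over (observed, theoretical) m/z pairs.

    The pairs are assumed to come from confident peptide spectrum matches.
    """
    return median(ppm_error(obs, theo) for obs, theo in pairs)


def apply_shift(observed_mz, shift_ppm):
    """Undo a constant multiplicative bias of shift_ppm."""
    return observed_mz / (1 + shift_ppm * 1e-6)


# Toy data: every measurement reads 5 ppm too high.
theoretical = [400.0, 800.0, 1200.0]
observed = [t * (1 + 5e-6) for t in theoretical]

shift = fit_global_shift(list(zip(observed, theoretical)))
corrected = [apply_shift(obs, shift) for obs in observed]
residual = median(ppm_error(c, t) for c, t in zip(corrected, theoretical))
print(round(shift, 3), round(residual, 3))
```

A real calibration would likely let the correction vary with retention time or m/z; the constant shift is only the base case that shows what "centered at 0 ppm" means.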


Subjects
Calibration, Mass Spectrometry/instrumentation, Mass Spectrometry/methods, Peptide Fragments/analysis, Proteins/analysis, Proteome/analysis, Software, Algorithms, Liquid Chromatography/methods, Cluster Analysis, Humans, Information Storage and Retrieval, Proteomics/methods
12.
Nature ; 513(7518): 382-7, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25043054

ABSTRACT

Extensive genomic characterization of human cancers presents the problem of inference from genomic abnormalities to cancer phenotypes. To address this problem, we analysed proteomes of colon and rectal tumours characterized previously by The Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Somatic variants displayed reduced protein abundance compared to germline variants. Messenger RNA transcript abundance did not reliably predict protein abundance differences between tumours. Proteomics identified five proteomic subtypes in the TCGA cohort, two of which overlapped with the TCGA 'microsatellite instability/CpG island methylation phenotype' transcriptomic subtype, but had distinct mutation, methylation and protein expression patterns associated with different clinical outcomes. Although copy number alterations showed strong cis- and trans-effects on mRNA abundance, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. The chromosome 20q amplicon was associated with the largest global changes at both mRNA and protein levels; proteomics data highlighted potential 20q candidates, including HNF4A (hepatocyte nuclear factor 4, alpha), TOMM34 (translocase of outer mitochondrial membrane 34) and SRC (SRC proto-oncogene, non-receptor tyrosine kinase). Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology.


Subjects
Colonic Neoplasms/genetics, Colonic Neoplasms/metabolism, Genomics, Proteome/metabolism, Rectal Neoplasms/genetics, Rectal Neoplasms/metabolism, Transcriptome/genetics, Human Chromosome Pair 20/genetics, CpG Islands/genetics, DNA Copy Number Variations/genetics, DNA Methylation, Hepatocyte Nuclear Factor 4/genetics, Humans, Microsatellite Repeats/genetics, Mitochondrial Membrane Transport Proteins/genetics, Mitochondrial Precursor Protein Import Complex Proteins, Missense Mutation/genetics, Neoplasm Proteins/analysis, Neoplasm Proteins/genetics, Neoplasm Proteins/metabolism, Point Mutation/genetics, Proteome/analysis, Proteome/genetics, Proteomics, Proto-Oncogene Mas, Proto-Oncogene Proteins pp60(c-src)/genetics, Messenger RNA/analysis, Messenger RNA/genetics, Messenger RNA/metabolism, Neoplasm RNA/analysis, Neoplasm RNA/genetics, Neoplasm RNA/metabolism
13.
Anal Chem ; 86(5): 2497-509, 2014 Mar 04.
Article in English | MEDLINE | ID: mdl-24494671

ABSTRACT

Shotgun proteomics experiments integrate a complex sequence of processes, any of which can introduce variability. Quality metrics computed from LC-MS/MS data have relied upon identifying MS/MS scans, but a new mode for the QuaMeter software produces metrics that are independent of identifications. Rather than evaluating each metric independently, we have created a robust multivariate statistical toolkit that accommodates the correlation structure of these metrics and allows for hierarchical relationships among data sets. The framework enables visualization and structural assessment of variability. Study 1 for the Clinical Proteomics Technology Assessment for Cancer (CPTAC), which analyzed three replicates of two common samples at each of two time points among 23 mass spectrometers in nine laboratories, provided the data to demonstrate this framework, and CPTAC Study 5 provided data from complex lysates under Standard Operating Procedures (SOPs) to complement these findings. Identification-independent quality metrics enabled the differentiation of sites and run-times through robust principal components analysis and subsequent factor analysis. Dissimilarity metrics revealed outliers in performance, and a nested ANOVA model revealed the extent to which all metrics or individual metrics were impacted by mass spectrometer and run time. Study 5 data revealed that even when SOPs have been applied, instrument-dependent variability remains prominent, although it may be reduced, while within-site variability is reduced significantly. Finally, identification-independent quality metrics were shown to be predictive of identification sensitivity in these data sets. QuaMeter and the associated multivariate framework are available from http://fenchurch.mc.vanderbilt.edu and http://homepages.uc.edu/~wang2x7/ , respectively.


Subjects
Liquid Chromatography/methods, Quality Control, Tandem Mass Spectrometry/methods, Analysis of Variance, Humans, Multivariate Analysis, Neoplasm Proteins/metabolism, Neoplasms/metabolism, Reproducibility of Results
14.
Mol Cell Proteomics ; 13(1): 360-71, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24187338

ABSTRACT

The proteome informatics research group of the Association of Biomolecular Resource Facilities conducted a study to assess the community's ability to detect and characterize peptides bearing a range of biologically occurring post-translational modifications when present in a complex peptide background. A data set derived from a mixture of synthetic peptides with biologically occurring modifications combined with a yeast whole cell lysate as background was distributed to a large group of researchers and their results were collectively analyzed. The results from the twenty-four participants, who represented a broad spectrum of experience levels with this type of data analysis, produced several important observations. First, there is significantly more variability in the ability to assess whether a result is significant than there is to determine the correct answer. Second, labile post-translational modifications, particularly tyrosine sulfation, present a challenge for most researchers. Finally, for modification site localization there are many tools being employed, but researchers are currently unsure of the reliability of the results these programs are producing.


Subjects
Peptides/isolation & purification, Post-Translational Protein Processing/genetics, Proteome, Amino Acid Sequence/genetics, Complex Mixtures/chemistry, Complex Mixtures/genetics, Computational Biology, Humans, Peptides/chemistry, Peptides/metabolism, Protein Sequence Analysis
15.
J Proteome Res ; 12(9): 4111-21, 2013 Sep 06.
Article in English | MEDLINE | ID: mdl-23879310

ABSTRACT

Differentiating and quantifying protein differences in complex samples produces significant challenges in sensitivity and specificity. Label-free quantification can draw from two different information sources: precursor intensities and spectral counts. Intensities are accurate for calculating protein relative abundance, but values are often missing due to peptides that are identified sporadically. Spectral counting can reliably reproduce difference lists, but differentiating peptides or quantifying all but the most concentrated protein changes is usually beyond its abilities. Here we developed new software, IDPQuantify, to align multiple replicates using principal component analysis, extract accurate precursor intensities from MS data, and combine intensities with spectral counts for significant gains in differentiation and quantification. We have applied IDPQuantify to three comparative proteomic data sets featuring gold standard protein differences spiked in complicated backgrounds. The software is able to associate peptides with peaks that are otherwise left unidentified to increase the efficiency of protein quantification, especially for low-abundance proteins. By combining intensities with spectral counts from IDPicker, it gains an average of 30% more true positive differences among top differential proteins. IDPQuantify quantifies protein relative abundance accurately in these test data sets to produce good correlations between known and measured concentrations.


Subjects
Peptide Mapping/methods, Proteome/chemistry, Software, Fungal Proteins/chemistry, Fungal Proteins/metabolism, Humans, Peptide Mapping/standards, Principal Component Analysis, Proteome/metabolism, Proteomics, Reference Standards, Sensitivity and Specificity, Tandem Mass Spectrometry/standards, Yeasts
16.
Genomics Proteomics Bioinformatics ; 11(2): 86-95, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23499924

ABSTRACT

In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
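
The Naive model described here is easy to write down, which helps show why it over-predicts for highly charged precursors. The sketch below enumerates b and y ions at every charge below the precursor charge, using standard monoisotopic residue masses. It is an illustration of the Naive model only, not of Basophile's ordinal regression; the function name and the choice to allow charge 1 fragments for singly charged precursors are assumptions.

```python
# Monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def naive_fragments(peptide, precursor_charge):
    """List (ion_name, charge, m/z) for every b/y ion the Naive model predicts.

    Every peptide bond is cleaved, and each fragment is emitted at all
    charges below the precursor charge (at least charge 1, by assumption).
    """
    max_charge = max(1, precursor_charge - 1)
    fragments = []
    for i in range(1, len(peptide)):
        b_mass = sum(MONO[aa] for aa in peptide[:i])
        y_mass = sum(MONO[aa] for aa in peptide[i:]) + WATER
        for z in range(1, max_charge + 1):
            fragments.append((f"b{i}", z, (b_mass + z * PROTON) / z))
            fragments.append((f"y{len(peptide) - i}", z,
                              (y_mass + z * PROTON) / z))
    return fragments


doubly = naive_fragments("PEPTIDEK", 2)
triply = naive_fragments("PEPTIDEK", 3)
print(len(doubly), len(triply))  # 14 28
```

Going from a 2+ to a 3+ precursor doubles the predicted fragment list for the same peptide; a charge-retention model in the spirit of Basophile would aim to drop the members of that list that are unlikely to be observed.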


Subjects
Algorithms, Protein Databases, Chemical Models, Peptide Fragments/chemistry, Peptides/analysis, Amino Acid Sequence, Animals, Electrochemistry, Humans, Information Storage and Retrieval/methods, Peptides/chemistry, Protein Precursors/chemistry, Software, Tandem Mass Spectrometry/methods
18.
Anal Chem ; 84(14): 5845-50, 2012 Jul 17.
Article in English | MEDLINE | ID: mdl-22697456

ABSTRACT

LC-MS/MS-based proteomics studies rely on stable analytical system performance that can be evaluated by objective criteria. The National Institute of Standards and Technology (NIST) introduced the MSQC software to compute diverse metrics from experimental LC-MS/MS data, enabling quality analysis and quality control (QA/QC) of proteomics instrumentation. In practice, however, several attributes of the MSQC software prevent its use for routine instrument monitoring. Here, we present QuaMeter, an open-source tool that improves MSQC in several aspects. QuaMeter can directly read raw data from instruments manufactured by different vendors. The software can work with a wide variety of peptide identification software for improved reliability and flexibility. Finally, QC metrics implemented in QuaMeter are rigorously defined and tested. The source code and binary versions of QuaMeter are available under Apache 2.0 License at http://fenchurch.mc.vanderbilt.edu.


Subjects
Liquid Chromatography/instrumentation, Proteomics/instrumentation, Tandem Mass Spectrometry/instrumentation, Peptides/analysis, Software, Time Factors
19.
J Proteome Res ; 11(3): 1686-95, 2012 Mar 02.
Article in English | MEDLINE | ID: mdl-22217208

ABSTRACT

Spectral libraries have emerged as a viable alternative to protein sequence databases for peptide identification. These libraries contain previously detected peptide sequences and their corresponding tandem mass spectra (MS/MS). Search engines can then identify peptides by comparing experimental MS/MS scans to those in the library. Many of these algorithms employ the dot product score for measuring the quality of a spectrum-spectrum match (SSM). This scoring system does not offer a clear statistical interpretation and ignores fragment ion m/z discrepancies in the scoring. We developed a new spectral library search engine, Pepitome, which employs statistical systems for scoring SSMs. Pepitome outperformed the leading library search tool, SpectraST, when analyzing data sets acquired on three different mass spectrometry platforms. We characterized the reliability of spectral library searches by confirming shotgun proteomics identifications through RNA-Seq data. Applying spectral library and database searches on the same sample revealed their complementary nature. Pepitome identifications enabled the automation of quality analysis and quality control (QA/QC) for shotgun proteomics data acquisition pipelines.
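
The dot product score criticized in this abstract can be sketched as follows. This is a generic binned, normalized dot product written for illustration, not the code of SpectraST or Pepitome; the bin width, function names, and toy spectra are hypothetical.

```python
from math import sqrt


def bin_spectrum(peaks, bin_width=1.0):
    """Sum (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def normalized_dot_product(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two binned spectra (0 to 1)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Toy library spectrum and two queries: one nearly exact in m/z,
# one shifted by almost a full dalton but still in the same bins.
library = [(175.05, 3.0), (300.20, 4.0)]
query_close = [(175.10, 3.0), (300.25, 4.0)]
query_far = [(175.95, 3.0), (300.95, 4.0)]

score_close = normalized_dot_product(library, query_close)
score_far = normalized_dot_product(library, query_far)
print(score_close, score_far)  # 1.0 1.0
```

Both queries receive the same perfect score even though the second is off by nearly 1 m/z unit on every peak, which is one way to see the abstract's point that the dot product ignores fragment ion m/z discrepancies.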


Subjects
Algorithms, Peptide Mapping/methods, Search Engine, Software, Blood Proteins/chemistry, Cell Line, Protein Databases, Humans, Statistical Models, Computer Neural Networks, Peptide Mapping/standards, Proteome/chemistry, Proteome/genetics, Proteome/metabolism, Reference Standards, Protein Sequence Analysis/methods, Bovine Serum Albumin/chemistry, Tandem Mass Spectrometry/methods, Tandem Mass Spectrometry/standards
20.
Bioinformatics ; 27(22): 3214-5, 2011 Nov 15.
Article in English | MEDLINE | ID: mdl-21965817

ABSTRACT

SUMMARY: The large amount of data produced by proteomics experiments requires effective bioinformatics tools for the integration of data management and data analysis. Here we introduce a suite of tools developed at Vanderbilt University to support production proteomics. We present the Backup Utility Service tool for automated instrument file backup and the ScanSifter tool for data conversion. We also describe a queuing system to coordinate identification pipelines and the File Collector tool for batch copying analytical results. These tools are individually useful but collectively reinforce each other. They are particularly valuable for proteomics core facilities or research institutions that need to manage multiple mass spectrometers. With minor changes, they could support other types of biomolecular resource facilities.


Subjects
Proteomics/methods, Software, Mass Spectrometry, Proteome/chemistry