Results 1 - 20 of 35
1.
Anal Bioanal Chem ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38744720

ABSTRACT

Advances in high-throughput high-resolution mass spectrometry and the development of the thermal proteome profiling (TPP) approach have made it possible to accelerate the search for drug targets. Since its introduction in 2014, TPP quickly became a method of choice in chemical proteomics for identifying drug-to-protein interactions on a proteome-wide scale and mapping the pathways of these interactions, thus further elucidating the unknown mechanisms of action of a drug under study. However, the current TPP implementations based on tandem mass spectrometry (MS/MS), associated with employing lengthy peptide separation protocols and expensive labeling techniques for sample multiplexing, limit the scaling of this approach for the ever-growing variety of drug-to-proteomes. A variety of ultrafast proteomics methods have been developed in the last couple of years. Among them, DirectMS1 provides MS/MS-free quantitative proteome-wide analysis on a 5-min time scale, thus opening the way for sample-hungry applications, such as TPP. In this work, we demonstrate the first implementation of the TPP approach using the ultrafast proteome-wide analysis based on DirectMS1. Using the drug topotecan, which is a known topoisomerase I (TOP1) inhibitor, the feasibility of the method for identifying drug targets at the whole proteome level was demonstrated for an ovarian cancer cell line.

2.
Proteomics ; 24(1-2): e2300090, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37496303

ABSTRACT

The coefficient of variation (CV) is often used in proteomics as a proxy to characterize the performance of a quantitation method and/or the related software. In this note, we question the excessive reliance on this metric in quantitative proteomics that may result in erroneous conclusions. We support this note using a ground-truth Human-Yeast-E. coli dataset demonstrating in a number of cases that erroneous data processing methods may lead to a low CV which has nothing to do with these methods' performances in quantitation.


Subject(s)
Escherichia coli , Proteomics , Humans , Mass Spectrometry/methods , Proteomics/methods , Software , Saccharomyces cerevisiae
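
The caution raised in entry 2 can be illustrated with a small sketch. The intensities below are invented for illustration and are not taken from the paper's Human-Yeast-E. coli dataset, and the `clamp` step is a hypothetical example of faulty processing. It shows how a processing error can drive the CV to zero while erasing a known 2:1 ratio between samples.

```python
import statistics

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Invented replicate intensities for one protein with a true 2:1 ratio (A:B).
sample_a = [200.0, 210.0, 190.0]
sample_b = [100.0, 105.0, 95.0]

def clamp(values, floor=500.0):
    """Hypothetical faulty step: every value below `floor` is replaced by `floor`."""
    return [max(v, floor) for v in values]

bad_a, bad_b = clamp(sample_a), clamp(sample_b)

# Ratio of mean intensities before and after the faulty step.
ratio_good = statistics.mean(sample_a) / statistics.mean(sample_b)
ratio_bad = statistics.mean(bad_a) / statistics.mean(bad_b)

print(f"CV correct: {cv(sample_a):.3f}, ratio {ratio_good:.1f}")
print(f"CV faulty:  {cv(bad_a):.3f}, ratio {ratio_bad:.1f}")
```

In this toy case the faulty pipeline reports the lower CV, yet only the correct pipeline recovers the expected ratio, which is why CV alone says little about quantitation accuracy.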
3.
J Proteome Res ; 22(9): 2827-2835, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37579078

ABSTRACT

One of the key steps in data dependent acquisition (DDA) proteomics is detection of peptide isotopic clusters, also called "features", in MS1 spectra and matching them to MS/MS-based peptide identifications. A number of peptide feature detection tools became available in recent years, each relying on its own matching algorithm. Here, we provide an integrated solution, the intensity-based Quantitative Mix and Match Approach (IQMMA), which integrates a number of untargeted peptide feature detection algorithms and returns the most probable intensity values for the MS/MS-based identifications. IQMMA was tested using available proteomic data acquired for both well-characterized (ground truth) and real-world biological samples, including a mix of Yeast and E. coli digests spiked at different concentrations into the Human K562 digest used as a background, and a set of glioblastoma cell lines. Three open-source feature detection algorithms were integrated: Dinosaur, biosaur2, and OpenMS FeatureFinder. None of them was found optimal when applied individually to all the data sets employed in this work; however, their combined use in IQMMA improved efficiency of subsequent protein quantitation. The software implementing IQMMA is freely available at https://github.com/PostoenkoVI/IQMMA under Apache 2.0 license.


Subject(s)
Proteomics , Tandem Mass Spectrometry , Humans , Escherichia coli , Algorithms , Peptides/chemistry , Software
4.
J Proteome Res ; 22(8): 2734-2742, 2023 08 04.
Article in English | MEDLINE | ID: mdl-37395192

ABSTRACT

Current proteomics approaches rely almost exclusively on using the positive ionization mode, resulting in inefficient ionization of many acidic peptides. This study investigates protein identification efficiency in the negative ionization mode using the DirectMS1 method. DirectMS1 is an ultrafast data acquisition method based on accurate peptide mass measurements and predicted retention times. Our method achieves the highest rate of protein identification in the negative ion mode to date, identifying over 1000 proteins in a human cell line at a 1% false discovery rate. This is accomplished using a single-shot 10 min separation gradient, comparable to lengthy MS/MS-based analyses. Optimizing separation and experimental conditions was achieved by utilizing mobile buffers containing 2.5 mM imidazole and 3% isopropanol. The study emphasized the complementary nature of data obtained in positive and negative ion modes. Combining the results from all replicates in both polarities increased the number of identified proteins to 1774. Additionally, we analyzed the method's efficiency using different proteases for protein digestion. Among the four studied proteases (LysC, GluC, AspN, and trypsin), trypsin and LysC demonstrated the highest protein identification yield. This suggests that digestion procedures utilized in positive-mode proteomics can be effectively applied in the negative ion mode. Data are deposited to ProteomeXchange: PXD040583.


Subject(s)
Proteomics , Tandem Mass Spectrometry , Humans , Tandem Mass Spectrometry/methods , Trypsin , Proteomics/methods , Peptides/analysis , Proteins , Peptide Hydrolases/metabolism
5.
J Proteome Res ; 22(6): 1695-1711, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37158322

ABSTRACT

The proteogenomic search pipeline developed in this work has been applied for reanalysis of 40 publicly available shotgun proteomic datasets from various human tissues comprising more than 8000 individual LC-MS/MS runs, of which 5442 .raw data files were processed in total. This reanalysis was focused on searching for ADAR-mediated RNA editing events, their clustering across samples of different origins, and classification. In total, 33 recoded protein sites were identified in 21 datasets. Of those, 18 sites were detected in at least two datasets, representing the core human protein editome. In agreement with prior works, neural and cancer tissues were found to be enriched with recoded proteins. Quantitative analysis indicated that the recoding rate of specific sites did not directly depend on the levels of ADAR enzymes or targeted proteins themselves; rather, it was governed by differential and yet undescribed regulation of interaction of enzymes with mRNA. Nine recoding sites conserved between humans and rodents were validated by targeted proteomics using stable isotope standards in the murine brain cortex and cerebellum, and an additional one was validated in human cerebrospinal fluid. In addition to previous data of the same type from cancer proteomes, we provide a comprehensive catalog of recoding events caused by ADAR RNA editing in the human proteome.


Subject(s)
Proteogenomics , Proteomics , Humans , Animals , Mice , RNA/metabolism , RNA Editing , Chromatography, Liquid , Tandem Mass Spectrometry , Proteome/genetics , Proteome/metabolism , Adenosine/metabolism , Inosine/genetics , Inosine/metabolism
6.
Biochemistry (Mosc) ; 87(11): 1342-1353, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36509723

ABSTRACT

Protein quantitation in tissue cells or physiological fluids based on liquid chromatography/mass spectrometry is one of the key sources of information on the mechanisms of cell functioning during chemotherapeutic treatment. Information on significant changes in protein expression upon treatment can be obtained by chemical proteomics and requires analysis of the cellular proteomes, as well as development of experimental and bioinformatic methods for identification of the drug targets. Low throughput of whole proteome analysis based on liquid chromatography and tandem mass spectrometry is one of the main factors limiting the scale of these studies. The method of direct mass spectrometric identification of proteins, DirectMS1, is one of the approaches developed in recent years allowing ultrafast proteome-wide analyses employing minute-scale gradients for separation of proteolytic mixtures. The aim of this work was to evaluate both the possibilities and the limitations of the method for identification of drug targets at the level of whole proteome and for revealing cellular processes activated by the treatment. Particularly, the available literature data on chemical proteomics obtained earlier for a large set of onco-pharmaceuticals using multiplex quantitative proteome profiling were analyzed. The results obtained were further compared with the proteome-wide data acquired by the DirectMS1 method using ultrashort separation gradients to evaluate efficiency of the method in identifying known drug targets. Using ovarian cancer cell line A2780 as an example, a whole-proteome comparison of two cell lysis techniques was performed, including the freeze-thaw lysis commonly employed in chemical proteomics and the one based on ultrasonication for cell disruption, which is widely accepted as a standard in proteomic studies. Also, the proteome-wide profiling was performed using the ultrafast DirectMS1 method for the A2780 cell line treated with lonidamine, followed by gene ontology analyses to evaluate capabilities of the method in revealing regulation of proteins in the cellular processes associated with drug treatment.


Subject(s)
Ovarian Neoplasms , Proteome , Humans , Female , Proteome/metabolism , Proteomics/methods , Cell Line, Tumor , Ovarian Neoplasms/drug therapy , Tandem Mass Spectrometry
7.
Anal Chem ; 94(38): 13068-13075, 2022 09 27.
Article in English | MEDLINE | ID: mdl-36094425

ABSTRACT

Recently, we presented the DirectMS1 method of ultrafast proteome-wide analysis based on minute-long LC gradients and MS1-only mass spectra acquisition. Currently, the method provides the depth of human cell proteome coverage of 2500 proteins at a 1% false discovery rate (FDR) when using 5 min LC gradients and 7.3 min runtime in total. While the standard MS/MS approaches provide 4000-5000 protein identifications within a couple of hours of instrumentation time, we advocate here that the higher number of identified proteins does not always translate into better quantitation quality of the proteome analysis. To further elaborate on this issue, we performed a one-on-one comparison of quantitation results obtained using DirectMS1 with three popular MS/MS-based quantitation methods: label-free quantitation (LFQ) and tandem mass tag (TMT) quantitation, both based on data-dependent acquisition (DDA), and data-independent acquisition (DIA). For comparison, we performed a series of proteome-wide analyses of well-characterized (ground truth) and biologically relevant samples, including a mix of UPS1 proteins spiked at different concentrations into an Escherichia coli digest used as a background and a set of glioblastoma cell lines. MS1-only data was analyzed using a novel quantitation workflow called DirectMS1Quant developed in this work. The results obtained in this study demonstrated comparable quantitation efficiency of 5 min DirectMS1 with both TMT and DIA methods, yet the latter two utilized a 10-20-fold longer instrumentation time.


Subject(s)
Proteome , Proteomics , Chromatography, Liquid/methods , Humans , Proteome/analysis , Proteomics/methods , Tandem Mass Spectrometry/methods , Workflow
8.
J Proteome Res ; 21(6): 1566-1574, 2022 06 03.
Article in English | MEDLINE | ID: mdl-35549218

ABSTRACT

Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.


Subject(s)
Proteomics , Tandem Mass Spectrometry , Algorithms , Cluster Analysis , Consensus , Databases, Protein , Proteomics/methods , Software , Tandem Mass Spectrometry/methods
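
As an illustration of the binning strategy named in entry 8, the following is a minimal sketch of how a consensus spectrum might be built from a cluster by m/z binning. It is not the benchmark's implementation: the function name `bin_consensus`, the bin width, the `min_fraction` cutoff, and the intensity-weighted m/z are all assumptions chosen for this example, and peak intensities are assumed to be positive.

```python
from collections import defaultdict

def bin_consensus(spectra, bin_width=0.02, min_fraction=0.5):
    """Build a consensus peak list from a cluster of spectra.

    spectra: list of spectra, each a list of (mz, intensity) pairs.
    A bin is kept only if at least `min_fraction` of the spectra
    contribute a peak to it.
    """
    bins = defaultdict(list)  # bin index -> [(spectrum index, mz, intensity)]
    for i, spectrum in enumerate(spectra):
        for mz, intensity in spectrum:
            bins[int(mz / bin_width)].append((i, mz, intensity))

    consensus = []
    for idx in sorted(bins):
        peaks = bins[idx]
        n_spectra = len({i for i, _, _ in peaks})
        if n_spectra / len(spectra) >= min_fraction:
            total = sum(inten for _, _, inten in peaks)
            # Intensity-weighted mean m/z of the peaks in this bin.
            mz = sum(m * inten for _, m, inten in peaks) / total
            # Intensity averaged over all spectra in the cluster.
            consensus.append((mz, total / len(spectra)))
    return consensus

# Invented three-spectrum cluster: only the peak near m/z 100 is reproducible.
cluster = [
    [(100.001, 10.0), (200.005, 5.0)],
    [(100.003, 30.0)],
    [(100.002, 20.0), (300.01, 1.0)],
]
print(bin_consensus(cluster))
```

Peaks seen in only one of the three spectra fall below the cutoff and are dropped, so the consensus keeps only the reproducible signal.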
9.
Int J Mol Sci ; 23(9)2022 May 08.
Article in English | MEDLINE | ID: mdl-35563635

ABSTRACT

Cancer cell lines responded differentially to type I interferon treatment in models of oncolytic therapy using vesicular stomatitis virus (VSV). Two opposite cases were considered in this study, glioblastoma DBTRG-05MG and osteosarcoma HOS cell lines exhibiting resistance and sensitivity to VSV after the treatment, respectively. Type I interferon responses were compared for these cell lines by integrative analysis of the transcriptome, proteome, and RNA editome to identify molecular factors determining differential effects observed. Adenosine-to-inosine RNA editing was equally induced in both cell lines. However, transcriptome analysis showed that the number of differentially expressed genes was much higher in DBTRG-05MG with a specific enrichment in inflammatory proteins. Further, it was found that two genes, EGFR and HER2, were overexpressed in HOS cells compared with DBTRG-05MG, supporting recent reports that EGF receptor signaling attenuates interferon responses via HER2 co-receptor activity. Accordingly, combined treatment of cells with EGF receptor inhibitors such as gefitinib and type I interferon increases the resistance of sensitive cell lines to VSV. Moreover, sensitive cell lines had increased levels of HER2 protein compared with non-sensitive DBTRG-05MG. Presumably, the level of this protein expression in tumor cells might be a predictive biomarker of their resistance to oncolytic viral therapy.


Subject(s)
Interferon Type I , Oncolytic Virotherapy , Oncolytic Viruses , Vesicular Stomatitis , Animals , Cell Line, Tumor , ErbB Receptors/genetics , Interferon Type I/metabolism , Oncolytic Viruses/physiology , Vesicular stomatitis Indiana virus/genetics , Vesiculovirus/physiology
10.
J Proteome Res ; 21(6): 1438-1448, 2022 06 03.
Article in English | MEDLINE | ID: mdl-35536917

ABSTRACT

Mass spectrometry-based proteome analysis implies matching the mass spectra of proteolytic peptides to amino acid sequences predicted from genomic sequences. Reliability of peptide variant identification in proteogenomic studies is often lacking. We propose a way to interpret shotgun proteomics results, specifically in the data-dependent acquisition mode, as protein sequence coverage by multiple reads, as is done in nucleic acid sequencing for calling single nucleotide variants. Multiple reads for each sequence position could be provided by overlapping distinct peptides, thus confirming the presence of certain amino acid residues in the overlapping stretch with a lower false discovery rate. Overlapping distinct peptides originate from miscleaved tryptic peptides in combination with their properly cleaved counterparts and from peptides generated by multiple proteases after the same specimen is subjected to parallel digestion and analyzed separately. We illustrate this approach using publicly available multiprotease data sets and our own data generated for the HEK-293 cell line digests obtained using trypsin, LysC, and GluC proteases. In total, up to 30% of the whole proteome was covered by tryptic peptides, with up to 7% covered twofold or more. The proteogenomic analysis of the HEK-293 cell line revealed 36 single amino acid variants, seven of which were supported by multiple reads.


Subject(s)
Proteogenomics , Amino Acids , HEK293 Cells , Humans , Peptide Hydrolases , Peptides/analysis , Proteogenomics/methods , Proteome/analysis , Reproducibility of Results
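
The "multiple reads" idea in entry 10 can be sketched as a per-residue coverage depth, by analogy with read depth in nucleic acid sequencing. This is a minimal illustration rather than the authors' code: the protein sequence and peptide list are invented, and `coverage_depth` is a hypothetical helper that locates peptides by exact substring matching.

```python
def coverage_depth(protein, peptides):
    """Count, for each residue of `protein`, how many distinct peptides cover it."""
    depth = [0] * len(protein)
    for pep in set(peptides):  # identical peptides count as one read
        start = protein.find(pep)
        while start != -1:
            for pos in range(start, start + len(pep)):
                depth[pos] += 1
            start = protein.find(pep, start + 1)
    return depth

# Invented example: a tryptic peptide, its miscleaved counterpart, and one more peptide.
protein = "MKTAYIAKQRQISFVK"
peptides = ["TAYIAK", "MKTAYIAK", "QISFVK"]

depth = coverage_depth(protein, peptides)
covered_once = sum(d >= 1 for d in depth) / len(protein)
covered_twice = sum(d >= 2 for d in depth) / len(protein)
print(depth)
print(f"covered at least once: {covered_once:.1%}, at least twice: {covered_twice:.1%}")
```

Here the stretch shared by the properly cleaved and miscleaved peptides reaches a depth of two, which corresponds to the "supported by multiple reads" case described in the abstract.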
11.
J Am Soc Mass Spectrom ; 32(5): 1258-1262, 2021 May 05.
Article in English | MEDLINE | ID: mdl-33900766

ABSTRACT

Protein inference is one of the crucial steps in proteome characterization using a bottom-up approach. Multiple algorithms to solve the problem are focused on extensive analysis of shared peptides identified from fragmentation mass spectra (MS/MS). However, many protein homologues with a similar amino acid sequence typically have identical lists of identified peptides due to the problem of proteome undersampling in a bottom-up approach and, thus, cannot be distinguished by existing protein inference methods. Here, we propose the use of peptide feature information extracted from precursor mass spectra to assist in identification of proteins otherwise indistinguishable from MS/MS. The proposed method was integrated with a protein inference algorithm based on the parsimony principle and built into the postsearch utility Scavager. The results demonstrate increasing accuracy and efficiency of homologous protein identifications for the well-characterized data sets, including the one with known protein sequences from the iPRG-2016 study.


Subject(s)
Algorithms , Proteins/chemistry , Proteomics/methods , Tandem Mass Spectrometry/methods , Databases, Protein , HeLa Cells , Humans , Peptides/chemistry
12.
J Proteome Res ; 20(4): 1864-1873, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33720732

ABSTRACT

Proteome-wide analyses rely on tandem mass spectrometry and the extensive separation of proteolytic mixtures. This imposes considerable instrumental time consumption, which is one of the main obstacles in the broader acceptance of proteomics in biomedical and clinical research. Recently, we presented a fast proteomic method termed DirectMS1 based on ultrashort LC gradients as well as MS1-only mass spectra acquisition and data processing. The method allows significant reduction of the proteome-wide analysis time to a few minutes at the depth of quantitative proteome coverage of 1000 proteins at 1% false discovery rate (FDR). In this work, to further increase the capabilities of the DirectMS1 method, we explored the opportunities presented by the recent progress in the machine-learning area and applied the LightGBM decision tree boosting algorithm to the scoring of peptide feature matches when processing MS1 spectra. Furthermore, we integrated the peptide feature identification algorithm of DirectMS1 with the recently introduced peptide retention time prediction utility, DeepLC. Additional approaches to improve the performance of the DirectMS1 method are discussed and demonstrated, such as using FAIMS for gas-phase ion separation. As a result of all improvements to DirectMS1, we succeeded in identifying more than 2000 proteins at 1% FDR from the HeLa cell line in a 5 min gradient LC-FAIMS/MS1 analysis. The data sets generated and analyzed during the current study have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD023977.


Subject(s)
Proteome , Proteomics , Chromatography, High Pressure Liquid , HeLa Cells , Humans , Machine Learning
13.
Rapid Commun Mass Spectrom ; : e9045, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33450063

ABSTRACT

RATIONALE: One of the important steps in initial data processing of peptide mass spectra is the detection of peptide features in full-range mass spectra. Ion mobility offers advantages over previous methods performing this detection by providing an additional structure-specific separation dimension. However, there is a lack of open-source software that utilizes these advantages and detects peptide features in mass spectra acquired along with ion mobility data using new instruments such as timsTOF and/or FAIMS-Orbitrap. METHODS: Recently, a utility called Dinosaur was presented, which provides an efficient way for feature detection in peptide ion mass spectra. In this work we extended its functionality by developing Biosaur software to fully employ the additional information provided by ion mobility data. Biosaur was developed using the Python 3.8 programming language. RESULTS: Biosaur supports the processing of data acquired using mass spectrometers with ion mobility capabilities, specifically timsTOF and FAIMS. In addition, it processes mass spectra obtained in negative ion mode and reports a cosine correlation table for peptide features, which is useful for differentiation between in-source fragments and semi-tryptic peptides. CONCLUSIONS: Biosaur is a utility for detecting peptide features in liquid chromatography-mass spectra with ion mobility and negative ion support. The software is distributed with an open-source Apache 2.0 license and is freely available on GitHub: https://github.com/abdrakhimov1/Biosaur.

14.
J Proteome Res ; 19(10): 4046-4060, 2020 10 02.
Article in English | MEDLINE | ID: mdl-32866021

ABSTRACT

Adenosine-to-inosine RNA editing is an enzymatic post-transcriptional modification which modulates immunity and neural transmission in multicellular organisms. In particular, it involves editing of mRNA codons with the resulting amino acid substitutions. We identified such sites for developmental proteomes of Drosophila melanogaster at the protein level using available data for 15 stages of fruit fly development from egg to imago and 14 time points of embryogenesis. In total, 40 sites were obtained, each belonging to a unique protein, including four sites related to embryogenesis. The interactome analysis has revealed that the majority of the editing-recoded proteins were associated with synaptic vesicle trafficking and actomyosin organization. Quantitation data analysis suggested the existence of a phase-specific RNA editing regulation with yet unknown mechanisms. These findings supported the transcriptome analysis results, which showed that a burst in the RNA editing occurs during insect metamorphosis from pupa to imago. Finally, targeted proteomic analysis was performed to quantify editing-recoded and genomically encoded versions of five proteins in brains of larvae, pupae, and imago insects, which showed a clear tendency toward an increase in the editing rate for each of them. These results will allow a better understanding of the protein role in physiological effects of RNA editing.


Subject(s)
Drosophila Proteins , RNA Editing , Adenosine Deaminase/genetics , Adenosine Deaminase/metabolism , Animals , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Inosine/metabolism , Proteome/genetics , Proteome/metabolism , Proteomics , RNA, Messenger/genetics
15.
Anal Chem ; 92(6): 4326-4333, 2020 03 17.
Article in English | MEDLINE | ID: mdl-32077687

ABSTRACT

Proteome characterization relies heavily on tandem mass spectrometry (MS/MS) and is thus associated with instrumentation complexity, lengthy analysis time, and limited duty cycle. It was always tempting to implement approaches that do not require MS/MS, yet they were constantly failing to achieve a meaningful depth of quantitative proteome coverage within short experimental times, which is particularly important for clinical or biomarker-discovery applications. Here, we report on the first successful attempt to develop a truly MS/MS-free method, DirectMS1, for bottom-up proteomics. The method is compared with the standard MS/MS-based data-dependent acquisition approach for proteome-wide analysis using 5 min LC gradients. Specifically, we demonstrate identification of 1 000 protein groups for a standard HeLa cell line digest. The amount of loaded sample was varied in a range from 1 to 500 ng, and the method demonstrated 10-fold higher sensitivity. Combined with the recently introduced Diffacto approach for relative protein quantification, DirectMS1 outperforms most popular MS/MS-based label-free quantitation approaches because of significantly higher protein sequence coverage.


Subject(s)
Neoplasm Proteins/analysis , Proteome/analysis , Proteomics , Saccharomyces cerevisiae Proteins/analysis , HeLa Cells , Humans , Tandem Mass Spectrometry , Time Factors
16.
Proteomics ; 19(23): e1900195, 2019 12.
Article in English | MEDLINE | ID: mdl-31576663

ABSTRACT

Proteogenomics is based on the use of customized genome or RNA sequencing databases for interrogation of shotgun proteomics data in search for proteome-level evidence of genome variations or RNA editing. In this work, the products of adenosine-to-inosine RNA editing in human and murine brain proteomes are identified using publicly available brain proteome LC-MS/MS datasets and an RNA editome database compiled from several sources. After filtering of false-positive results, 20 and 37 sites of editing in proteins belonging to 14 and 32 genes are identified for murine and human brain proteomes, respectively. Eight sites of editing identified with high spectral counts overlapped between human and mouse brain samples. Some of these sites have been previously reported using orthogonal methods, such as those in α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) glutamate receptors, CYFIP2, and coatomer alpha. Also, differential editing between neurons and microglia is demonstrated in this work for some of the proteins from primary murine brain cell cultures. Because many edited sites are still not characterized functionally at the protein level, the results provide a necessary background for their further analysis in normal and diseased cells and tissues using targeted proteomic approaches.


Subject(s)
Adenosine/metabolism , Brain/metabolism , Inosine/metabolism , RNA Editing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Animals , Cells, Cultured , Coatomer Protein/metabolism , Humans , Mice , Proteome/metabolism , Proteomics/methods
17.
Proteomics ; 19(3): e1800280, 2019 02.
Article in English | MEDLINE | ID: mdl-30537264

ABSTRACT

Shotgun proteomics workflows for database protein identification typically include a combination of search engines and postsearch validation software based mostly on machine learning algorithms. Here, a new postsearch validation tool called Scavager employing CatBoost, an open-source gradient boosting library, which shows improved efficiency compared with the other popular algorithms, such as Percolator, PeptideProphet, and Q-ranker, is presented. The comparison is done using multiple data sets and search engines, including MSGF+, MSFragger, X!Tandem, Comet, and recently introduced IdentiPy. Implemented in Python programming language, Scavager is open-source and freely available at https://bitbucket.org/markmipt/scavager.


Subject(s)
Algorithms , Proteomics/methods , Databases, Protein , HEK293 Cells , HeLa Cells , Humans , Machine Learning , Programming Languages , Search Engine , Software
18.
J Proteome Res ; 18(2): 709-714, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30576148

ABSTRACT

Many of the novel ideas that drive today's proteomic technologies are focused essentially on experimental or data-processing workflows. The latter are implemented and published in a number of ways, from custom scripts and programs, to projects built using general-purpose or specialized workflow engines; a large part of routine data processing is performed manually or with custom scripts that remain unpublished. Facilitating the development of reproducible data-processing workflows becomes essential for increasing the efficiency of proteomic research. To assist in overcoming the bioinformatics challenges in the daily practice of proteomic laboratories, 5 years ago we developed and announced Pyteomics, a freely available open-source library providing Python interfaces to proteomic data. We summarize the new functionality of Pyteomics developed during the time since its introduction.


Subject(s)
Proteomics/methods , Software , User-Computer Interface , Computational Biology , Workflow
19.
Proteomics ; 18(23): e1800117, 2018 12.
Article in English | MEDLINE | ID: mdl-30307114

ABSTRACT

The efficiency of proteome analysis depends strongly on the configuration parameters of the search engine. One of the murkiest and least trivial among them is the list of amino acid modifications included for the search. Here, an approach called AA_stat is presented for uncovering the unexpected modifications of amino acid residues in the protein sequences, as well as possible artifacts of data acquisition or processing, in the results of proteome analyses. The approach is based on comparing the amino acid frequencies of different mass shifts observed using the open search method introduced recently. In this work, the proposed approach is applied to publicly available proteomic data and its feasibility for discovering unaccounted modifications or possible pitfalls of the identification workflow is demonstrated. The algorithm is implemented in Python as an open-source command-line tool available at https://bitbucket.org/J_Bale/aa_stat/.


Subject(s)
Amino Acids/analysis , Peptides/analysis , Proteomics/methods , Algorithms
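
The frequency comparison described in entry 19 can be sketched as follows. This is only an illustration of the general idea, not the AA_stat algorithm itself: the peptide lists are invented, the helper names are hypothetical, and the simple frequency ratio is an assumed stand-in for whatever normalization the tool applies.

```python
from collections import Counter

def aa_frequencies(peptides):
    """Relative frequency of each amino acid across a list of peptide sequences."""
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def enrichment(shifted_peptides, reference_peptides):
    """Ratio of residue frequency in mass-shifted peptides to that in reference peptides."""
    shifted = aa_frequencies(shifted_peptides)
    reference = aa_frequencies(reference_peptides)
    return {aa: shifted.get(aa, 0.0) / freq for aa, freq in reference.items()}

# Invented data: peptides matched with no mass shift vs. with one particular shift.
reference = ["PEPTIDEK", "MAGICK", "SAMPLER"]
shifted = ["MAGMMK", "MEMK"]

scores = enrichment(shifted, reference)
top = max(scores, key=scores.get)
print(top, round(scores[top], 2))
```

A residue that is strongly over-represented among peptides carrying a given mass shift is a candidate site for the corresponding modification; in this toy data that residue is methionine.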
20.
J Proteome Res ; 17(11): 3889-3903, 2018 11 02.
Article in English | MEDLINE | ID: mdl-30298734

ABSTRACT

Adenosine-to-inosine RNA editing is one of the most common types of RNA editing, a posttranscriptional modification made by special enzymes. We present a proteomic study on this phenomenon for Drosophila melanogaster. Three proteome data sets were used in the study: two taken from a public repository and the third one obtained here. A customized protein sequence database was generated using results of genome-wide adenosine-to-inosine RNA studies and applied for identifying the edited proteins. A total of 68 edited peptides belonging to 59 proteins were identified in all data sets. Eight of them were shared between the whole insect, head, and brain proteomes. Seven edited sites belonging to synaptic vesicle and membrane trafficking proteins were selected for validation by orthogonal analysis by Multiple Reaction Monitoring. Five editing events in cpx, Syx1A, Cadps, CG4587, and EndoA were validated in fruit fly brain tissue at the proteome level using isotopically labeled standards. Ratios of unedited-to-edited proteoforms varied from 35:1 (Syx1A) to 1:2 (EndoA). Lys-137 to Glu editing of endophilin A may have functional consequences for its interaction with the membrane. The work demonstrates the feasibility of identifying RNA editing events at the proteome level using shotgun proteomics and a customized edited protein database.


Subject(s)
Adenosine/metabolism , Drosophila melanogaster/genetics , Inosine/metabolism , Insect Proteins/genetics , Proteogenomics/methods , RNA Editing , Acyltransferases/chemistry , Acyltransferases/genetics , Acyltransferases/metabolism , Adenosine Deaminase/genetics , Adenosine Deaminase/metabolism , Amino Acid Sequence , Animals , Base Sequence , Brain/metabolism , Databases, Protein , Datasets as Topic , Drosophila Proteins/chemistry , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Drosophila melanogaster/chemistry , Drosophila melanogaster/metabolism , Insect Proteins/classification , Insect Proteins/metabolism , Models, Molecular , Molecular Sequence Annotation , Proteome/genetics , Proteome/metabolism , Qa-SNARE Proteins/genetics , Qa-SNARE Proteins/metabolism , Synaptic Vesicles/chemistry , Synaptic Vesicles/metabolism
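
The Lys-to-Glu recoding mentioned in entry 20 follows from the fact that inosine is read as guanosine during translation, so an edited A behaves like G in the codon. The sketch below enumerates the amino acid changes that a single A-to-I edit can produce in a given mRNA codon using the standard genetic code. The function name `recoding_outcomes` is hypothetical and the example is illustrative, not part of the study's pipeline.

```python
from itertools import product

# Standard genetic code in UCAG order ("*" marks stop codons).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def recoding_outcomes(codon):
    """List (position, edited codon, new amino acid) for each single A-to-I edit
    that changes the encoded amino acid; inosine is treated as G."""
    original = CODON_TABLE[codon]
    outcomes = []
    for i, base in enumerate(codon):
        if base == "A":
            edited = codon[:i] + "G" + codon[i + 1:]
            new = CODON_TABLE[edited]
            if new != original:
                outcomes.append((i + 1, edited, new))
    return outcomes

# Both lysine codons can be recoded to glutamate by editing the first position.
for lys_codon in ("AAA", "AAG"):
    print(lys_codon, CODON_TABLE[lys_codon], "->", recoding_outcomes(lys_codon))
```

Editing the first adenosine of either lysine codon yields a glutamate codon (GAA or GAG), while editing the second yields arginine; an edit at the third position of AAA is synonymous and is therefore not listed.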