Results 1 - 20 of 24
1.
Biology (Basel) ; 13(4)2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38666884

ABSTRACT

Obesity is a socially significant disease that is characterized by a disproportionate accumulation of fat. It is also associated with chronic inflammation, cancer, diabetes, and other comorbidities. Investigating biomarkers and pathological processes linked to obesity is especially vital for young individuals, given their increased potential for lifestyle modifications. By comparing the genetic, proteomic, and metabolomic profiles of individuals categorized as underweight, normal, overweight, and obese, we aimed to determine which omics layer most accurately reflects the phenotypic changes in an organism that result from obesity. We profiled blood plasma samples by employing three omics methodologies. The untargeted GC×GC-MS metabolomics approach identified 313 metabolites. To augment the metabolomic dataset, we integrated a label-free HPLC-MS/MS proteomics method, leading to the identification of 708 proteins. The genomic layer encompassed the genotyping of 647,250 SNPs. Utilizing omics data, we trained sparse Partial Least Squares models to predict body mass index. Molecular features exhibiting frequently non-zero coefficients were selected as potential biomarkers, and we further explored enriched biological pathways. Proteomics was the most effective in single-omics analyses, with a median absolute error (MAE) of 5.44 ± 0.31 kg/m², incorporating an average of 24 proteins per model. Metabolomics showed slightly lower performance (MAE = 6.06 ± 0.33 kg/m²), followed by genomics (MAE = 6.20 ± 0.34 kg/m²). As expected, multiomic models demonstrated better accuracy, particularly the combination of proteomics and metabolomics (MAE = 4.77 ± 0.33 kg/m²), while including genomics data did not enhance the results. This manuscript is the first multiomics study of obesity in a gender-balanced cohort of young adults profiled by genomic, proteomic, and metabolomic methods. The comprehensive approach provides novel insights into the molecular mechanisms of obesity, opening avenues for more targeted interventions.
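
The selection of features with frequently non-zero coefficients, and the median absolute error used to score the models, can be illustrated with a minimal Python sketch. It assumes that each fitted sparse model is available as a dictionary mapping feature names to coefficients; the function names, the 0.5 cut-off and the toy values are hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import median


def frequently_selected(coef_runs, min_fraction=0.5):
    """Return (feature, selection frequency) pairs for features whose
    coefficient is non-zero in at least `min_fraction` of the models.

    `coef_runs` is a list of dicts, one per fitted model (assumed layout).
    The 0.5 default is an illustrative threshold, not the paper's value.
    """
    counts = Counter()
    for coefs in coef_runs:
        for name, value in coefs.items():
            if value != 0.0:
                counts[name] += 1
    n_runs = len(coef_runs)
    kept = [(name, c / n_runs) for name, c in counts.items()
            if c / n_runs >= min_fraction]
    # Most frequently selected first; ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


def median_absolute_error(y_true, y_pred):
    """Median of the absolute differences between observed and predicted values."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


if __name__ == "__main__":
    # Toy coefficients from three hypothetical model fits.
    runs = [
        {"A": 0.3, "B": 0.0, "C": -0.1},
        {"A": 0.2, "B": 0.0, "C": 0.0},
        {"A": 0.0, "B": 0.5, "C": 0.4},
    ]
    print(frequently_selected(runs))
    print(median_absolute_error([20, 25, 30], [22, 24, 37]))
```

Counting how often a feature survives across many sparse fits is one common way to turn unstable single-model selections into a more robust candidate list; the abstract does not state the exact frequency cut-off that was applied.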

2.
J Proteome Res ; 22(6): 1695-1711, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37158322

ABSTRACT

The proteogenomic search pipeline developed in this work has been applied for reanalysis of 40 publicly available shotgun proteomic datasets from various human tissues comprising more than 8000 individual LC-MS/MS runs, of which 5442 .raw data files were processed in total. This reanalysis was focused on searching for ADAR-mediated RNA editing events, their clustering across samples of different origins, and classification. In total, 33 recoded protein sites were identified in 21 datasets. Of those, 18 sites were detected in at least two datasets, representing the core human protein editome. In agreement with prior works, neural and cancer tissues were found to be enriched with recoded proteins. Quantitative analysis indicated that the recoding rate of specific sites did not directly depend on the levels of ADAR enzymes or targeted proteins themselves; rather, it was governed by differential and as yet undescribed regulation of the interaction of enzymes with mRNA. Nine recoding sites conserved between humans and rodents were validated by targeted proteomics using stable isotope standards in the murine brain cortex and cerebellum, and an additional one was validated in human cerebrospinal fluid. In addition to previous data of the same type from cancer proteomes, we provide a comprehensive catalog of recoding events caused by ADAR RNA editing in the human proteome.


Subject(s)
Proteogenomics , Proteomics , Humans , Animals , Mice , RNA/metabolism , RNA Editing , Chromatography, Liquid , Tandem Mass Spectrometry , Proteome/genetics , Proteome/metabolism , Adenosine/metabolism , Inosine/genetics , Inosine/metabolism
3.
Biology (Basel) ; 12(2)2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36829477

ABSTRACT

Although modern biology is now in the post-genomic era with vastly increased access to high-quality data, the set of human genes with a known function remains far from complete. This is especially true for hundreds of mitochondria-associated genes, which are under-characterized and lack clear functional annotation. However, with the advent of multi-omics profiling methods coupled with systems biology algorithms, the cellular role of many such genes can be elucidated. Here, we report genes and pathways associated with TOMM34, Translocase of Outer Mitochondrial Membrane, which plays a role in mitochondrial protein import as part of a cytosolic complex together with Hsp70/Hsp90 and is upregulated in various cancers. We identified genes, proteins, and metabolites altered in TOMM34-/- HepG2 cells. To our knowledge, this is the first attempt to study the functional capacity of TOMM34 using a multi-omics strategy. We demonstrate that TOMM34 affects various processes including oxidative phosphorylation, the citric acid cycle, and the metabolism of purine and several amino acids. Besides the analysis of already known pathways, we utilized a de novo network enrichment algorithm to extract novel perturbed subnetworks, thus obtaining evidence that TOMM34 potentially plays a role in several other cellular processes, including NOTCH-, MAPK-, and STAT3-signaling. Collectively, our findings provide new insights into TOMM34's cellular functions.

4.
Int J Mol Sci ; 23(9)2022 May 08.
Article in English | MEDLINE | ID: mdl-35563635

ABSTRACT

Cancer cell lines responded differentially to type I interferon treatment in models of oncolytic therapy using vesicular stomatitis virus (VSV). Two opposite cases were considered in this study, glioblastoma DBTRG-05MG and osteosarcoma HOS cell lines exhibiting resistance and sensitivity to VSV after the treatment, respectively. Type I interferon responses were compared for these cell lines by integrative analysis of the transcriptome, proteome, and RNA editome to identify molecular factors determining differential effects observed. Adenosine-to-inosine RNA editing was equally induced in both cell lines. However, transcriptome analysis showed that the number of differentially expressed genes was much higher in DBTRG-05MG with a specific enrichment in inflammatory proteins. Further, it was found that two genes, EGFR and HER2, were overexpressed in HOS cells compared with DBTRG-05MG, supporting recent reports that EGF receptor signaling attenuates interferon responses via HER2 co-receptor activity. Accordingly, combined treatment of cells with EGF receptor inhibitors such as gefitinib and type I interferon increases the resistance of sensitive cell lines to VSV. Moreover, sensitive cell lines had increased levels of HER2 protein compared with non-sensitive DBTRG-05MG. Presumably, the level of this protein expression in tumor cells might be a predictive biomarker of their resistance to oncolytic viral therapy.


Subject(s)
Interferon Type I , Oncolytic Virotherapy , Oncolytic Viruses , Vesicular Stomatitis , Animals , Cell Line, Tumor , ErbB Receptors/genetics , Interferon Type I/metabolism , Oncolytic Viruses/physiology , Vesicular stomatitis Indiana virus/genetics , Vesiculovirus/physiology
5.
Scientometrics ; 127(4): 1953-1967, 2022.
Article in English | MEDLINE | ID: mdl-35221395

ABSTRACT

The paper describes a scheme for the comparative analysis of sets of PubMed publications. The proposed analysis is based on comparing the frequencies of occurrence of keywords (MeSH terms). The purpose of the analysis is to identify MeSH terms that characterize research areas specific to each group of articles, as well as to identify trends, that is, topics on which the number of published works has changed significantly in recent years. The proposed approach was tested by comparing a set of medical publications and a group of articles in the field of personalized medicine. We analyzed about 700 thousand abstracts published in the period 2009-2021 and indexed them with MeSH terms. Topics with increasing research interest have been identified both in the field of medicine in general and specific to personalized medicine. Retrospective analysis of changes in keyword frequency of occurrence has shown the shift of the scientific priorities in this area over the past 10 years. The revealed patterns can be used to predict the relevance and significance of a direction of scientific work over a horizon of 3-5 years. The proposed analysis can be scaled in the future for a larger number of groups of publications, as well as adjusted by introducing filters at the stage of sampling (scientific centers, journals, availability of full texts, etc.) or selecting a list of keywords (frequency threshold, use of qualifiers, category of generalizations). SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11192-022-04292-y.
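
The idea of comparing keyword frequencies between two sets of publications can be sketched in Python as follows. The abstract does not state which statistic the authors used, so the smoothed log2 ratio of article shares below is an assumption chosen for illustration; the function names, the pseudocount and the toy article lists are hypothetical.

```python
import math
from collections import Counter


def term_document_frequency(articles):
    """Count, for each term, the number of articles indexed with it.

    `articles` is a list of term lists, one per article (assumed layout).
    """
    counts = Counter()
    for terms in articles:
        counts.update(set(terms))  # each term counted once per article
    return counts


def compare_term_frequencies(group_a, group_b, pseudocount=0.5):
    """Return {term: log2 ratio of the term's article share in A versus B}.

    Positive values mean the term is relatively more frequent in group A.
    The pseudocount keeps the ratio finite for terms absent from one group.
    """
    freq_a = term_document_frequency(group_a)
    freq_b = term_document_frequency(group_b)
    n_a, n_b = len(group_a), len(group_b)
    scores = {}
    for term in set(freq_a) | set(freq_b):
        share_a = (freq_a[term] + pseudocount) / (n_a + 2 * pseudocount)
        share_b = (freq_b[term] + pseudocount) / (n_b + 2 * pseudocount)
        scores[term] = math.log2(share_a / share_b)
    return scores


if __name__ == "__main__":
    # Toy data: two tiny groups of articles with their index terms.
    personalized = [["Genomics", "Humans"], ["Genomics"]]
    general = [["Humans"], ["Humans", "Surgery"]]
    scores = compare_term_frequencies(personalized, general)
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{term}: {score:+.2f}")
```

The same function could be applied to two publication-year windows of a single group to look for trending terms, which is one possible reading of the retrospective analysis described above.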

6.
Biology (Basel) ; 10(11)2021 Nov 04.
Article in English | MEDLINE | ID: mdl-34827124

ABSTRACT

Long-read direct RNA sequencing developed by Oxford Nanopore Technologies (ONT) is quickly gaining popularity for transcriptome studies, while fast turnaround time and low cost make it an attractive instrument for clinical applications. There is a growing interest in utilizing transcriptome data to unravel activated biological processes responsible for disease progression and response to therapies. This trend is of particular interest for precision medicine, which aims at single-patient analysis. Here we evaluated whether gene abundances measured by MinION direct RNA sequencing are suited to produce robust estimates of pathway activation for single sample scoring methods. We performed multiple RNA-seq analyses for a single sample that originated from the HepG2 cell line, namely five ONT replicates and three replicates using Illumina NovaSeq. Two pathway scoring methods were employed: ssGSEA and singscore. We estimated the ONT performance in terms of detected protein-coding genes and average pairwise correlation between pathway activation scores using an exhaustive computational scheme for all combinations of replicates. In brief, we found that at least two ONT replicates are required to obtain reproducible pathway scores for both algorithms. We hope that our findings may be of interest to researchers planning their ONT direct RNA-seq experiments.
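
The average pairwise correlation between pathway activation scores can be sketched as below. This is a simplified illustration that only averages Pearson correlations over all pairs of replicates; it does not reproduce the authors' full scheme over larger replicate combinations, nor does it compute ssGSEA or singscore values, which are assumed to be given. The function names and toy scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(replicates):
    """Average Pearson correlation over every unordered pair of replicates.

    `replicates` is a list of score vectors, one per replicate, with the
    pathways in the same order in each vector (assumed layout).
    """
    pairs = list(combinations(replicates, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy pathway scores for three hypothetical replicates.
    scores = [
        [0.10, 0.40, 0.80, 0.30],
        [0.12, 0.35, 0.85, 0.28],
        [0.05, 0.50, 0.70, 0.33],
    ]
    print(round(mean_pairwise_correlation(scores), 3))
```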

7.
J Pers Med ; 11(2)2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33494491

ABSTRACT

Obesity is a frightening chronic disease whose prevalence has tripled since 1975. It is not expected to slow down, remaining one of the leading causes of preventable death and resulting in an increased clinical and economic burden. Poor lifestyle choices and excessive intake of "cheap calories" are major contributors to obesity, triggering type 2 diabetes, cardiovascular diseases, and other comorbidities. Understanding the molecular mechanisms responsible for the development of obesity is essential, as it might result in the introduction of anti-obesity targets and early-stage obesity biomarkers, allowing the distinction between metabolic syndromes. The complex nature of this disease, coupled with the phenomenon of metabolically healthy obesity, inspired us to perform data-centric, hypothesis-generating pilot research aimed at finding correlations between parameters of classic clinical blood tests and proteomic profiles of 104 lean and obese subjects. As a result, we assembled patterns of proteins whose presence or absence allows predicting the weight of the patient fairly well. We believe that such proteomic patterns with high prediction power should facilitate the translation of potential candidates into biomarkers of clinical use for early-stage stratification of obesity therapy.

8.
J Proteomics ; 231: 104022, 2021 01 16.
Article in English | MEDLINE | ID: mdl-33096305

ABSTRACT

In order to optimize sample preparation for shotgun proteomics, we compared four cysteine alkylating agents: iodoacetamide, chloroacetamide, 4-vinylpyridine and methyl methanethiosulfonate, and estimated their effects on the results of proteome analysis. Because alkylation may result in methionine modification in vitro, proteomics data were searched for methionine to isothreonine conversions, which may mimic genomic methionine to threonine substitutions found in proteogenomic analyses. We found that chloroacetamide was superior to the other reagents in terms of the number of identified peptides and undesirable off-site reactions. Among the reagents evaluated, iodoacetamide increased the rate of methionine-to-isothreonine conversion, especially if the sample was prepared in gel. The presence of proline following methionine in a protein sequence increased the modification rate as well. Generally, the methionine-to-isothreonine conversion events were relatively rare, but should be taken into account in proteogenomic studies when searching for single nucleotide polymorphism events at the protein level. Additionally, we have evaluated other methionine modifications, such as oxidation and carbamidomethylation. We found that carbamidomethylation may affect up to 80% of peptides containing methionine under the condition of iodoacetamide alkylation. In this case, carbamidomethylation of methionine is more common than oxidation and should be accounted for as a variable modification during proteomic search. SIGNIFICANCE: One of the most trending questions in bottom-up proteomics is the depth of proteome profiling, in other words, the coverage of proteins by identified tryptic peptides. In proteogenomics, where the identification of a single peptide, e.g. bearing an amino acid substitution, may be of interest, high sequence coverage is especially important. 
Chemical modifications during sample preparation may mimic biologically significant coding mutations at the proteome level. A typical example of such modification is methionine to isothreonine conversion during alkylation, which mimics methionine to threonine substitution in protein sequences due to respective genomic mutations. Therefore, the studies on the proper selection of alkylating reagents which balance the cysteine alkylation efficiency and the extent of methionine conversion upon conventional proteomic sample preparation workflow are crucial for the outcome of proteogenomic analyses and should present a general interest for the proteomic community.


Subject(s)
Cysteine , Proteomics , Alkylation , Iodoacetamide , Methionine
9.
J Proteome Res ; 19(10): 4046-4060, 2020 10 02.
Article in English | MEDLINE | ID: mdl-32866021

ABSTRACT

Adenosine-to-inosine RNA editing is an enzymatic post-transcriptional modification which modulates immunity and neural transmission in multicellular organisms. In particular, it involves editing of mRNA codons with the resulting amino acid substitutions. We identified such sites for developmental proteomes of Drosophila melanogaster at the protein level using available data for 15 stages of fruit fly development from egg to imago and 14 time points of embryogenesis. In total, 40 sites were obtained, each belonging to a unique protein, including four sites related to embryogenesis. The interactome analysis has revealed that the majority of the editing-recoded proteins were associated with synaptic vesicle trafficking and actomyosin organization. Quantitation data analysis suggested the existence of a phase-specific RNA editing regulation with yet unknown mechanisms. These findings supported the transcriptome analysis results, which showed that a burst in the RNA editing occurs during insect metamorphosis from pupa to imago. Finally, targeted proteomic analysis was performed to quantify editing-recoded and genomically encoded versions of five proteins in brains of larvae, pupae, and imago insects, which showed a clear tendency toward an increase in the editing rate for each of them. These results will allow a better understanding of the protein role in physiological effects of RNA editing.


Subject(s)
Drosophila Proteins , RNA Editing , Adenosine Deaminase/genetics , Adenosine Deaminase/metabolism , Animals , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Inosine/metabolism , Proteome/genetics , Proteome/metabolism , Proteomics , RNA, Messenger/genetics
10.
Proteomics ; 19(23): e1900195, 2019 12.
Article in English | MEDLINE | ID: mdl-31576663

ABSTRACT

Proteogenomics is based on the use of customized genome or RNA sequencing databases for interrogation of shotgun proteomics data in search for proteome-level evidence of genome variations or RNA editing. In this work, the products of adenosine-to-inosine RNA editing in human and murine brain proteomes are identified using publicly available brain proteome LC-MS/MS datasets and an RNA editome database compiled from several sources. After filtering of false-positive results, 20 and 37 sites of editing in proteins belonging to 14 and 32 genes are identified for murine and human brain proteomes, respectively. Eight sites of editing identified with high spectral counts overlapped between human and mouse brain samples. Some of these sites have been previously reported using orthogonal methods, such as α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) glutamate receptors, CYFIP2, coatomer alpha. Also, differential editing between neurons and microglia is demonstrated in this work for some of the proteins from primary murine brain cell cultures. Because many edited sites are still not characterized functionally at the protein level, the results provide a necessary background for their further analysis in normal and diseased cells and tissues using targeted proteomic approaches.


Subject(s)
Adenosine/metabolism , Brain/metabolism , Inosine/metabolism , RNA Editing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Animals , Cells, Cultured , Coatomer Protein/metabolism , Humans , Mice , Proteome/metabolism , Proteomics/methods
11.
J Proteome Res ; 18(1): 120-129, 2019 01 04.
Article in English | MEDLINE | ID: mdl-30480452

ABSTRACT

This work continues the series of the quantitative measurements of the proteins encoded by different chromosomes in the blood plasma of a healthy person. Selected Reaction Monitoring with Stable Isotope-labeled peptide Standards (SRM SIS) and a gene-centric approach, which is the basis for the implementation of the international Chromosome-centric Human Proteome Project (C-HPP), were applied for the quantitative measurement of proteins in human blood plasma. Analyses were carried out in the frame of C-HPP for each protein-coding gene of the four human chromosomes: 18, 13, Y, and mitochondrial. Concentrations of proteins encoded by 667 genes were measured in 54 blood plasma samples of the volunteers, whose health conditions were consistent with requirements for astronauts. The gene list included 276, 329, 47, and 15 genes of chromosomes 18, 13, Y, and the mitochondrial chromosome, respectively. This paper does not make claims about the detection of missing proteins. Only 205 proteins (30.7%) were detected in the samples. Of them, 84, 106, 10, and 5 belonged to chromosomes 18, 13, and Y and the mitochondrial chromosome, respectively. Each detected protein was found in at least one of the samples analyzed. The SRM SIS raw data are available in the ProteomeXchange repository (PXD004374, PASS01192).


Subject(s)
Chromosomes, Human/chemistry , Plasma/chemistry , Proteome , Chromosomes, Human/genetics , Chromosomes, Human, Pair 13/chemistry , Chromosomes, Human, Pair 18/chemistry , Chromosomes, Human, Y/chemistry , Databases, Protein , Healthy Volunteers , Humans , Mitochondria/ultrastructure , Proteome/genetics
12.
J Proteome Res ; 17(5): 1801-1811, 2018 05 04.
Article in English | MEDLINE | ID: mdl-29619825

ABSTRACT

The identification of genetically encoded variants at the proteome level is an important problem in cancer proteogenomics. The generation of customized protein databases from DNA or RNA sequencing data is a crucial stage of the identification workflow. Genomic data filtering applied at this stage may significantly modify variant search results, yet its effect is generally left out of the scope of proteogenomic studies. In this work, we focused on this impact using data of exome sequencing and LC-MS/MS analyses of six replicates for eight melanoma cell lines processed by a proteogenomics workflow. The main objectives were identifying variant peptides and revealing the role of the genomic data filtering in the variant identification. A series of six confidence thresholds for single nucleotide polymorphisms and indels from the exome data were applied to generate customized sequence databases of different stringency. In the searches against unfiltered databases, between 100 and 160 variant peptides were identified for each of the cell lines using X!Tandem and MS-GF+ search engines. The recovery rate for variant peptides was ∼1%, which is approximately three times lower than that of the wild-type peptides. Using unfiltered genomic databases for variant searches resulted in higher sensitivity and selectivity of the proteogenomic workflow and positively affected the ability to distinguish the cell lines based on variant peptide signatures.


Subject(s)
Databases, Protein , Exome/genetics , Genetic Variation , Melanoma/pathology , Proteogenomics/methods , Animals , Cell Line, Tumor , Chromatography, Liquid , Humans , INDEL Mutation , Polymorphism, Single Nucleotide , Proteomics/methods , Search Engine , Tandem Mass Spectrometry
13.
PLoS One ; 12(5): e0177427, 2017.
Article in English | MEDLINE | ID: mdl-28493947

ABSTRACT

Liquid chromatography-tandem mass spectrometry was used to analyze plasma proteins of volunteers (control) and patients with glioblastoma multiforme (GBM). A database search was pre-set with a variable post-translational modification (PTM): phosphorylation, acetylation or ubiquitination. There were no significant differences between the control and the GBM groups regarding the number of protein identifications, sequence coverage or number of PTMs. However, in GBM plasma, we unambiguously observed a decreased fraction of post-translationally modified peptides identified with high quality. The disease-specific PTM patterns were extracted and mapped to the set of FDA-approved plasma protein markers. Decreases of 46% and 24% in the number of acetylated and ubiquitinated peptides, respectively, were observed in the GBM samples. Significance of capturing disease-associated patterns of protein modifications was envisaged.


Subject(s)
Biomarkers/blood , Glioblastoma/blood , Glioblastoma/metabolism , Acetylation , Chromatography, Liquid , Humans , Phosphorylation , Protein Processing, Post-Translational , Tandem Mass Spectrometry , Ubiquitination , alpha-2-HS-Glycoprotein/metabolism
14.
Int J Anal Chem ; 2016: 7436849, 2016.
Article in English | MEDLINE | ID: mdl-27298622

ABSTRACT

This work discusses bioinformatics and experimental approaches to explore the human proteome, a constellation of proteins expressed in different tissues and organs. As the human proteome is not a static entity, it seems necessary to estimate the number of different protein species (proteoforms) and measure the number of copies of the same protein in a specific tissue. Here, meta-analysis of neXtProt knowledge base is proposed for theoretical prediction of the number of different proteoforms that arise from alternative splicing (AS), single amino acid polymorphisms (SAPs), and posttranslational modifications (PTMs). Three possible cases are considered: (1) PTMs and SAPs appear exclusively in the canonical sequences of proteins, but not in splice variants; (2) PTMs and SAPs can occur in both proteins encoded by canonical sequences and in splice variants; (3) all modification types (AS, SAP, and PTM) occur as independent events. Experimental validation of proteoforms is limited by the analytical sensitivity of proteomic technology. A bell-shaped distribution histogram was generated for proteins encoded by a single chromosome, with the estimation of copy numbers in plasma, liver, and HepG2 cell line. The proposed metabioinformatics approaches can be used for estimation of the number of different proteoforms for any group of protein-coding genes.

15.
Proteomics ; 16(13): 1938-46, 2016 07.
Article in English | MEDLINE | ID: mdl-27193151

ABSTRACT

Twenty-nine human aqueous humor samples from patients with eye diseases such as cataract and glaucoma with and without pseudoexfoliation syndrome were characterized by LC-high resolution MS analysis. In total, 269 protein groups were identified with 1% false discovery rate, including 32 groups that were not reported previously for this biological fluid. Since the samples were analyzed individually rather than pooled, 36 proteins were identified in all samples, comprising the constitutive proteome of the fluid. The most dominant molecular function of aqueous humor proteins as determined by GO analysis is endopeptidase inhibitor activity. Label-free protein quantification showed no significant difference between glaucoma and cataract aqueous humor proteomes. At the same time, we found a decrease in the level of apolipoprotein D as a marker of the pseudoexfoliation syndrome. The data are available from ProteomeXchange repository (PXD002623).
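
The notion of a constitutive proteome, meaning the proteins identified in every individually analyzed sample, amounts to a set intersection. The Python sketch below assumes that identifications are available as one list of protein names per sample; the function name, sample labels and protein names are hypothetical placeholders, not data from the study.

```python
def constitutive_proteome(identifications):
    """Return the set of proteins identified in every sample.

    `identifications` maps a sample label to an iterable of protein names
    (assumed layout). An empty input yields an empty set.
    """
    sample_sets = [set(proteins) for proteins in identifications.values()]
    if not sample_sets:
        return set()
    return set.intersection(*sample_sets)


if __name__ == "__main__":
    # Toy identifications for three hypothetical samples.
    samples = {
        "sample_1": ["P1", "P2", "P3"],
        "sample_2": ["P1", "P3"],
        "sample_3": ["P3", "P1", "P4"],
    }
    print(sorted(constitutive_proteome(samples)))
```

Pooling the samples before analysis would make this computation impossible, since per-sample presence or absence would be lost; this is why individual analysis matters for the result quoted above.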


Subject(s)
Aqueous Humor/chemistry , Cataract/diagnosis , Exfoliation Syndrome/diagnosis , Glaucoma/diagnosis , Proteome/analysis , Aged , Aged, 80 and over , Apolipoproteins D/analysis , Biomarkers/analysis , Chromatography, Liquid , Humans , Middle Aged , Tandem Mass Spectrometry
16.
Rapid Commun Mass Spectrom ; 30(11): 1323-31, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27173114

ABSTRACT

RATIONALE: One of the problems in proteogenomic research aimed at identification of variant peptides is the presence of peptides with amino acid isomers of different origin in the analyzed samples. Among the most challenging examples are peptides with threonine and isothreonine (homoserine) in their sequences. Indeed, the latter residue may appear in vitro as a methionine substitution during sample preparation for shotgun proteome analysis. Yet, this substitution of Met to isoThr is not encoded genetically and should be unambiguously distinguished from, e.g., point mutations in proteins that result in Met conversion to Thr. METHODS: In this work we compared tandem mass (MS/MS) spectra produced by an Orbitrap mass spectrometer of Thr- and isoThr-containing tryptic peptides and found a distinctive feature in their collisionally activated fragmentation patterns. RESULTS: Up to 84% of MS/MS spectra for peptides containing isoThr residues have been positively specified. We also studied the differences in retention times for peptides containing Thr isoforms that can be further used for their distinction. CONCLUSIONS: Threonine can be distinguished from isothreonine by its retention time and HCD fragmentation pattern, specifically relative intensity of the bn - product ion, which can be further used in proteomic research. Copyright © 2016 John Wiley & Sons, Ltd.


Subject(s)
Chromatography, High Pressure Liquid/methods , Peptides/chemistry , Tandem Mass Spectrometry/methods , Threonine/analysis , Amino Acid Sequence , Humans , Isomerism
17.
J Proteomics ; 120: 169-78, 2015 Apr 29.
Article in English | MEDLINE | ID: mdl-25779464

ABSTRACT

Searching deep proteome data for 9 NCI-60 cancer cell lines obtained earlier by Moghaddas Gholami et al. (Cell Reports, 2013) against a database from cancer genomes returned a variant tryptic peptide fragment 57-72 of molecular chaperone HSC70, in which the methionine residue at position 61 is replaced by a threonine, or isothreonine (homoserine), residue. However, no traces of the corresponding genetic alteration were found in the cell line genomes reported by Abaan et al. (Cancer Research, 2013). Studying the background of this modification led us to conclude that the conversion of methionine into isothreonine resulted from iodoacetamide treatment of the sample during a sample preparation step. We found that up to 10% of methionine-containing peptides experienced the above conversion for the datasets under study. The artifact was confirmed by a model experiment with bovine albumin, where three of four methionine residues were partly converted to isothreonine by conventional iodoacetamide treatment. This experimental side reaction has to be taken into account when searching for genetically encoded peptide variants in proteogenomics studies. BIOLOGICAL SIGNIFICANCE: A lot of effort is currently put into proteogenomics of cancer. Studies detect non-synonymous cancer mutations at the protein level by searching high-throughput LC-MS/MS data against customized genomic databases. In such studies, much attention is paid to potential false positive identifications. Here we describe one possible cause of such false identifications, an artifact of sample preparation which mimics a methionine to threonine nucleic acid-encoded variant. The methionine to isothreonine conversion should be taken into consideration for correct interpretation of proteogenomic data.


Subject(s)
Amino Acid Substitution/genetics , Artifacts , Methionine/genetics , Neoplasms/genetics , Proteome/genetics , Threonine/genetics , Cell Line, Tumor , False Positive Reactions , Genetic Markers/genetics , Genetic Variation/genetics , Humans , Proteomics/methods , Reproducibility of Results , Sensitivity and Specificity
18.
J Proteome Res ; 13(12): 5551-60, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25333775

ABSTRACT

The cancer genome deviates significantly from the reference human genome, and thus a search against standard genome databases in cancer cell proteomics fails to identify cancer-specific protein variants. The goal of this Article is to combine high-throughput exome data [Abaan et al. Cancer Res. 2013] and shotgun proteomics analysis [Moghaddas Gholami et al. Cell Rep. 2013] for cancer cell lines from the NCI-60 panel to demonstrate further that the cell lines can be effectively recognized using identified variant peptides. To achieve this goal, we generated a database containing mutant protein sequences of the NCI-60 panel of cell lines. The proteome data were searched using Mascot and X!Tandem search engines against databases of both reference and mutant protein sequences. The identification quality was further controlled by calculating the fraction of variant peptides encoded by each cell line's own exome sequence. We found that up to 92.2% of peptides identified by both search engines are encoded by the cell line's own exome. Further, we used the identified variant peptides for cell line recognition. The results of the study demonstrate that proteome data supported by exome sequence information can be effectively used for distinguishing between different types of cancer cell lines.
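
The quality-control fraction described above, and a naive form of cell line recognition built on it, can be sketched in Python. The abstract does not describe the recognition procedure, so picking the cell line whose exome explains the largest share of identified variant peptides is an assumption made for illustration; the function names, cell line labels and peptide strings are hypothetical.

```python
def own_exome_fraction(identified_variants, exome_variants):
    """Fraction of identified variant peptides that are predicted
    from a given cell line's exome (0.0 when nothing was identified)."""
    identified = set(identified_variants)
    if not identified:
        return 0.0
    return len(identified & set(exome_variants)) / len(identified)


def recognize_cell_line(identified_variants, exome_variants_by_line):
    """Return the cell line whose exome-derived variant peptides explain
    the largest share of the identified variant peptides.

    Illustrative rule only; the paper's recognition method is not
    specified in the abstract.
    """
    return max(
        exome_variants_by_line,
        key=lambda line: own_exome_fraction(
            identified_variants, exome_variants_by_line[line]
        ),
    )


if __name__ == "__main__":
    # Toy variant peptide sets for two hypothetical cell lines.
    exomes = {
        "line_A": {"PEPTIDEK", "VARIANTR", "SAMPLEK"},
        "line_B": {"OTHERK", "SAMPLEK"},
    }
    identified = ["PEPTIDEK", "VARIANTR", "OTHERK"]
    print(own_exome_fraction(identified, exomes["line_A"]))
    print(recognize_cell_line(identified, exomes))
```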


Subject(s)
Biomarkers, Tumor/metabolism , Exome , Proteome/metabolism , Amino Acid Sequence , Biomarkers, Tumor/chemistry , Biomarkers, Tumor/genetics , Cell Line, Tumor , Humans , Mutation, Missense , Peptide Fragments/chemistry , Polymorphism, Single Nucleotide , Proteome/chemistry , Proteome/genetics
19.
PLoS One ; 9(8): e103950, 2014.
Article in English | MEDLINE | ID: mdl-25083712

ABSTRACT

BACKGROUND: There are two ways that statistical methods can learn from biomedical data. One way is to learn classifiers to identify diseases and to predict outcomes using a training dataset with an established diagnosis for each sample. When a training dataset is not available, the task can be to mine for the presence of meaningful groups (clusters) of samples and to explore the underlying data structure (unsupervised learning). RESULTS: We investigated the proteomic profiles of the cytosolic fraction of human liver samples using two-dimensional electrophoresis (2DE). Samples were resected upon surgical treatment of hepatic metastases in colorectal cancer. Unsupervised hierarchical clustering of 2DE gel images (n = 18) revealed a pair of clusters, containing 11 and 7 samples. Previously we used the same specimens to measure biochemical profiles based on cytochrome P450-dependent enzymatic activities and also found that samples were clearly divided into two well-separated groups by cluster analysis. It turned out that the groups by enzyme activity almost perfectly match the groups identified from proteomic data. Of the 271 reproducible spots on our 2DE gels, we selected 15 to distinguish the human liver cytosolic clusters. Using MALDI-TOF peptide mass fingerprinting, we identified 12 proteins for the selected spots, including known cancer-associated species. CONCLUSIONS/SIGNIFICANCE: Our results highlight the importance of hierarchical cluster analysis of proteomic data and show concordance between the results of biochemical and proteomic approaches. Grouping of the human liver samples and/or patients into differing clusters may provide insights into possible molecular mechanisms of drug metabolism and create a rationale for personalized treatment.


Subject(s)
Liver Neoplasms/metabolism , Neoplasm Proteins/metabolism , Proteomics , Cluster Analysis , Cytosol/metabolism , Electrophoresis, Gel, Two-Dimensional , Humans , Microsomes, Liver/metabolism , Reproducibility of Results , Software , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
20.
BMC Evol Biol ; 12: 200, 2012 Oct 06.
Article in English | MEDLINE | ID: mdl-23039862

ABSTRACT

BACKGROUND: The exponential growth in the number of fully sequenced genomes at varying taxonomic distances allows one to characterize transcriptional regulation using comparative-genomics analysis instead of time-consuming experimental methods. A transcriptional regulatory unit consists of a transcription factor, its binding site and a regulated gene. These units constitute a graph which contains so-called "network motifs", subgraphs of a given structure. Here we consider genomes of closely related Enterobacteriales and estimate the fraction of conserved network motifs and sites as well as positions under selection in various types of non-coding regions. RESULTS: Using a newly developed technique, we found that the highest fraction of positions under selection, approximately 50%, was observed in synvergon spacers (between consecutive genes from the same strand), followed by ~45% in divergon spacers (common 5'-regions), and ~10% in convergon spacers (common 3'-regions). The fraction of selected positions in functional regions was higher, 60% in transcription factor-binding sites and ~45% in terminators and promoters. Small but significant differences were observed between Escherichia coli and Salmonella enterica. This fraction is similar to the one observed in eukaryotes. The conservation of binding sites demonstrated some differences between types of regulatory units. In E. coli strains, the interactions of the type "local transcriptional factor gene" turned out to be more conserved in feed-forward loops (FFLs) compared to non-motif interactions. The coherent FFLs tend to be less conserved than the incoherent FFLs. A natural explanation is that the former imply functional redundancy. CONCLUSIONS: The naïve hypothesis that FFLs would be highly conserved turned out to be not entirely true: their conservation depends on their status in the transcriptional network and on their usage. The fraction of positions under selection in intergenic regions of bacterial genomes is roughly similar to that of eukaryotes. Known regulatory sites explain 20±5% of selected positions.


Subject(s)
Escherichia coli/genetics , Evolution, Molecular , Gene Expression Regulation, Bacterial , Genome, Bacterial , Salmonella enterica/genetics , Binding Sites , Conserved Sequence , DNA, Intergenic/genetics , Gene Regulatory Networks , Selection, Genetic , Transcription Factors/genetics