Results 1 - 6 of 6
1.
Mol Cell Proteomics ; 20: 100076, 2021.
Article in English | MEDLINE | ID: mdl-33823297

ABSTRACT

Proteogenomics approaches often struggle to distinguish true from false peptide-to-spectrum matches as the database size grows. However, features extracted from tandem mass spectrometry intensity predictors can enhance the peptide identification rate and can provide extra confidence for peptide-to-spectrum matching in a proteogenomics context. To that end, features from the spectral intensity pattern predictors MS2PIP and Prosit were combined with the canonical scores from MaxQuant in the Percolator postprocessing tool for protein sequence databases constructed out of ribosome profiling and nanopore RNA-Seq analyses. The presented results provide evidence that this approach enhances both the identification rate and the validation stringency in a proteogenomic setting.
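As an illustration of what an intensity-based rescoring feature can look like, the following minimal sketch compares an observed fragment-intensity vector with a predicted one. It is not the code used in the study: the function names are hypothetical, the two vectors are assumed to be already aligned ion by ion, and Pearson correlation and the normalized spectral angle are shown only as two commonly used similarity measures.

```python
import math


def pearson(observed, predicted):
    """Pearson correlation between two aligned intensity vectors."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / math.sqrt(var_o * var_p)


def normalized_spectral_angle(observed, predicted):
    """1 - 2*acos(cosine similarity)/pi; 1.0 means identical direction."""
    if len(observed) != len(predicted):
        raise ValueError("vectors must have equal length")
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0
    cosine = sum(o * p for o, p in zip(observed, predicted)) / (norm_o * norm_p)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding drift
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


def intensity_features(observed, predicted):
    """Hypothetical feature dictionary that could be appended to search-engine scores."""
    return {
        "pearson": pearson(observed, predicted),
        "spectral_angle": normalized_spectral_angle(observed, predicted),
    }
```

Features of this kind would be added as extra columns per peptide-to-spectrum match next to the search engine scores before semi-supervised rescoring.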


Subject(s)
Proteogenomics/methods , Databases, Protein , HCT116 Cells , Humans , Machine Learning , RNA-Seq , Ribosomes
2.
Mol Cell Proteomics ; 18(8 suppl 1): S126-S140, 2019 08 09.
Article in English | MEDLINE | ID: mdl-31040227

ABSTRACT

PROTEOFORMER is a pipeline that enables the automated processing of data derived from ribosome profiling (RIBO-seq, i.e. the sequencing of ribosome-protected mRNA fragments). As such, genome-wide ribosome occupancies lead to the delineation of data-specific translation product candidates and these can improve the mass spectrometry-based identification. Since its first publication, different upgrades, new features and extensions have been added to the PROTEOFORMER pipeline. Some of the most important upgrades include P-site offset calculation during mapping, comprehensive data pre-exploration, the introduction of two alternative proteoform calling strategies and extended pipeline output features. These novelties are illustrated by analyzing ribosome profiling data of human HCT116 and Jurkat cells. The different proteoform calling strategies are used alongside one another and in the end combined with reference sequences from UniProt. Matching mass spectrometry data are searched against this extended search space with MaxQuant. Overall, besides annotated proteoforms, this pipeline leads to the identification and validation of different categories of new proteoforms, including translation products of up- and downstream open reading frames, 5' and 3' extended and truncated proteoforms, single amino acid variants, splice variants and translation products of so-called noncoding regions. Further, proof-of-concept is reported for the improvement of spectrum matching by including Prosit, a deep neural network strategy that adds extra fragmentation spectrum intensity features to the analysis. In the light of ribosome profiling-driven proteogenomics, it is shown that this allows validating the spectrum matches of newly identified proteoforms with elevated stringency. These updates and novel conclusions provide new insights and lessons for the ribosome profiling-based proteogenomic research field.
More practical information on the pipeline, raw code, the user manual (README) and explanations on the different modes of availability can be found at the GitHub repository of PROTEOFORMER: https://github.com/Biobix/proteoformer.
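The P-site offset step mentioned in the abstract can be pictured with a small sketch. This is not PROTEOFORMER code: the function name, the argument layout and the offset values are assumptions chosen for illustration, and real offsets are estimated per dataset and per read length.

```python
def p_site_position(five_prime, read_length, strand, offsets):
    """Return the genomic P-site coordinate of one ribosome footprint.

    five_prime  -- genomic coordinate of the read's 5' end (for '-' strand
                   reads this is the highest coordinate of the alignment)
    read_length -- footprint length in nucleotides
    strand      -- '+' or '-'
    offsets     -- dict mapping read length to P-site offset (hypothetical values)
    """
    offset = offsets.get(read_length)
    if offset is None:
        return None  # read length without a calibrated offset is skipped
    if strand == "+":
        return five_prime + offset
    if strand == "-":
        return five_prime - offset
    raise ValueError("strand must be '+' or '-'")


# Illustrative offsets only; not taken from the publication.
EXAMPLE_OFFSETS = {28: 12, 29: 12, 30: 13}
```

Assigning each footprint to a single P-site nucleotide in this way is what makes downstream reading-frame and start-site analyses possible.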


Subject(s)
Proteogenomics/methods , Ribosomes/metabolism , Chromatography, Liquid , HCT116 Cells , Humans , Jurkat Cells , Tandem Mass Spectrometry
3.
Comput Methods Programs Biomed ; 181: 104806, 2019 Nov.
Article in English | MEDLINE | ID: mdl-30401579

ABSTRACT

BACKGROUND AND OBJECTIVE: Ribosome profiling is a recent next generation sequencing technique enabling the genome-wide study of gene expression in biomedical research at the translation level. Too often, researchers precipitously start testing their hypotheses after alignment of their data, without checking the quality and the general features of their mapped data. Although these checks are essential to prevent errors and ensure valid conclusions afterwards, easy-to-use tools for visualizing the quality and overall outlook of mapped ribosome profiling data are lacking. METHODS: We present mQC, a modular tool implemented as a Bioconda package and also available in the Galaxy tool shed. With it, both bioinformaticians and non-experts can easily perform the indispensable visualization of both the quality and the general features of their mapped P-site corrected ribosome profiling reads. The user manual, the raw code and more information can be found on its GitHub repository (https://github.com/Biobix/mQC). RESULTS: mQC was tested on multiple datasets to assess its general applicability and was compared to other tools that partly perform similar tasks. CONCLUSIONS: Our results demonstrate that mQC can fill a vacant but essential position in the ribosome profiling data analysis procedure by performing a thorough RIBO-Seq-specific exploration of aligned and P-site corrected ribosome profiling data.
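One typical quality check on P-site corrected reads is the reading-frame distribution, since good ribosome profiling data show a strong preference for one frame. The sketch below is not taken from mQC; the function name and inputs are hypothetical, and it assumes P-site coordinates and the coding sequence start are given in the same transcript coordinate system.

```python
from collections import Counter


def frame_distribution(p_sites, cds_start):
    """Fraction of P-sites falling in frame 0, 1 and 2 relative to the CDS start."""
    counts = Counter((pos - cds_start) % 3 for pos in p_sites)
    total = sum(counts.values())
    if total == 0:
        return {0: 0.0, 1: 0.0, 2: 0.0}
    return {frame: counts.get(frame, 0) / total for frame in range(3)}
```

A distribution close to one third per frame would hint at poor P-site assignment or degraded footprints, which is the sort of problem such a check is meant to expose before hypothesis testing.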


Subject(s)
Computational Biology/methods , Gene Expression Profiling , Genome-Wide Association Study , Ribosomes/chemistry , Sequence Analysis, DNA , Algorithms , Cell Line, Tumor , Colonic Neoplasms/drug therapy , Cycloheximide/pharmacology , HCT116 Cells , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Open Reading Frames , Quality Control , RNA, Messenger/genetics , Reproducibility of Results , Sequence Analysis, RNA , Software
4.
Genome Res ; 28(1): 25-36, 2018 01.
Article in English | MEDLINE | ID: mdl-29162641

ABSTRACT

Translation initiation generally occurs at AUG codons in eukaryotes, although it has been shown that non-AUG or noncanonical translation initiation can also occur. However, the evidence for noncanonical translation initiation sites (TISs) is largely indirect and based on ribosome profiling (Ribo-seq) studies. Here, using a strategy specifically designed to enrich N termini of proteins, we demonstrate that many human proteins are translated at noncanonical TISs. The large majority of TISs that mapped to 5' untranslated regions were noncanonical and led to N-terminal extension of annotated proteins or translation of upstream small open reading frames (uORF). It has been controversial whether the amino acid corresponding to the start codon is incorporated at the TIS or methionine is still incorporated. We found that methionine was incorporated at almost all noncanonical TISs identified in this study. Comparison of the TISs determined through mass spectrometry with ribosome profiling data revealed that about two-thirds of the novel annotations were indeed supported by the available ribosome profiling data. Sequence conservation across species and a higher abundance of noncanonical TISs than canonical ones in some cases suggest that the noncanonical TISs can have biological functions. Overall, this study provides evidence of protein translation initiation at noncanonical TISs and argues that further studies are required for elucidation of functional implications of such noncanonical translation initiation.


Subject(s)
5' Untranslated Regions , Mass Spectrometry , Open Reading Frames , Peptide Chain Initiation, Translational , Ribosomes/metabolism , HEK293 Cells , Human Umbilical Vein Endothelial Cells/metabolism , Humans , Protein Domains , Ribosomes/genetics
5.
Nucleic Acids Res ; 45(13): 7997-8013, 2017 Jul 27.
Article in English | MEDLINE | ID: mdl-28541577

ABSTRACT

Alternative translation initiation mechanisms such as leaky scanning and reinitiation potentiate the polycistronic nature of human transcripts. By allowing for reprogrammed translation, these mechanisms can mediate biological responses to stimuli. We combined proteomics with ribosome profiling and mRNA sequencing to identify the biological targets of translation control triggered by the eukaryotic translation initiation factor 1 (eIF1), a protein implicated in the stringency of start codon selection. We quantified expression changes of over 4000 proteins and 10 000 actively translated transcripts, leading to the identification of 245 transcripts undergoing translational control mediated by upstream open reading frames (uORFs) upon eIF1 deprivation. Here, the stringency of start codon selection and preference for an optimal nucleotide context were largely diminished leading to translational upregulation of uORFs with suboptimal start. Interestingly, genes affected by eIF1 deprivation were implicated in energy production and sensing of metabolic stress.


Subject(s)
Eukaryotic Initiation Factors/metabolism , Neoplasm Proteins/metabolism , Nerve Tissue Proteins/metabolism , Peptide Chain Initiation, Translational , Cell Line , Codon, Initiator , Energy Metabolism/genetics , Eukaryotic Initiation Factors/antagonists & inhibitors , Eukaryotic Initiation Factors/genetics , Gene Expression , Gene Knockdown Techniques , HCT116 Cells , Humans , Neoplasm Proteins/antagonists & inhibitors , Neoplasm Proteins/genetics , Nerve Tissue Proteins/antagonists & inhibitors , Nerve Tissue Proteins/genetics , Nucleic Acid Conformation , Open Reading Frames , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/genetics , Ribosomes/metabolism , Stress, Physiological/genetics
6.
Nucleic Acids Res ; 44(D1): D324-9, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527729

ABSTRACT

With the advent of ribosome profiling, a next generation sequencing technique providing a "snapshot" of translated mRNA in a cell, many short open reading frames (sORFs) with ribosomal activity were identified. Follow-up studies revealed the existence of functional peptides, so-called micropeptides, translated from these sORFs, indicating a new class of bio-active peptides. Over the last few years, several micropeptides exhibiting important cellular functions were discovered. However, ribosome occupancy does not necessarily imply an actual function of the translated peptide, leading to the development of various tools assessing the coding potential of sORFs. Here, we introduce sORFs.org (http://www.sorfs.org), a novel database for sORFs identified using ribosome profiling. Starting from ribosome profiling, sORFs.org identifies sORFs, incorporates state-of-the-art tools and metrics and stores results in a public database. Two query interfaces are provided, a default one enabling quick lookup of sORFs and a BioMart interface providing advanced query and export possibilities. At present, sORFs.org harbors 263 354 sORFs that demonstrate ribosome occupancy, originating from three different cell lines: HCT116 (human), E14_mESC (mouse) and S2 (fruit fly). sORFs.org aims to provide an extensive sORFs database accessible to researchers with limited bioinformatics knowledge, thus enabling easy integration into personal projects.
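To make the notion of an sORF concrete, the following sketch scans the three forward reading frames of a nucleotide sequence for short ATG-initiated open reading frames. It is a simplification and not the sORFs.org procedure: the function name is hypothetical, the length bounds of 10 to 100 codons are an assumption, the reverse strand and non-ATG starts are ignored, and no ribosome profiling evidence is used.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_codons=10, max_codons=100):
    """Return (start, end, frame) tuples for short ORFs on the forward strand.

    start is the 0-based index of the ATG, end is the index just past the
    stop codon, and the codon count excludes the stop codon.
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    results.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return results
```

In a ribosome profiling setting, candidates found this way would still need footprint coverage and coding-potential metrics before being treated as translated sORFs.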


Subject(s)
Databases, Genetic , Open Reading Frames , Animals , Base Sequence , Cell Line , Conserved Sequence , Drosophila melanogaster/genetics , High-Throughput Nucleotide Sequencing , Humans , Internet , Mass Spectrometry , Mice , Peptides/chemistry , RNA, Messenger/chemistry , Ribosomes/metabolism , Sequence Analysis, RNA