Search | VHL Regional Portal (test)

sORFs.org: a repository of small ORFs identified by ribosome profiling.

Olexiouk, Volodimir; Crappé, Jeroen; Verbruggen, Steven; Verhegen, Kenneth; Martens, Lennart; Menschaert, Gerben.

Nucleic Acids Res ; 44(D1): D324-9, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26527729

ABSTRACT

With the advent of ribosome profiling, a next-generation sequencing technique providing a "snapshot" of translated mRNA in a cell, many short open reading frames (sORFs) with ribosomal activity were identified. Follow-up studies revealed the existence of functional peptides, so-called micropeptides, translated from these sORFs, indicating a new class of bioactive peptides. Over the last few years, several micropeptides exhibiting important cellular functions were discovered. However, ribosome occupancy does not necessarily imply an actual function of the translated peptide, leading to the development of various tools assessing the coding potential of sORFs. Here, we introduce sORFs.org (http://www.sorfs.org), a novel database for sORFs identified using ribosome profiling. Starting from ribosome profiling, sORFs.org identifies sORFs, incorporates state-of-the-art tools and metrics and stores results in a public database. Two query interfaces are provided: a default one enabling quick lookup of sORFs and a BioMart interface providing advanced query and export possibilities. At present, sORFs.org harbors 263 354 sORFs that demonstrate ribosome occupancy, originating from three different cell lines: HCT116 (human), E14_mESC (mouse) and S2 (fruit fly). sORFs.org aims to provide an extensive sORFs database accessible to researchers with limited bioinformatics knowledge, thus enabling easy integration into personal projects.

Subjects

Databases, Genetic; Open Reading Frames; Animals; Base Sequence; Cell Line; Conserved Sequence; Drosophila melanogaster/genetics; High-Throughput Nucleotide Sequencing; Humans; Internet; Mass Spectrometry; Mice; Peptides/chemistry; RNA, Messenger/chemistry; Ribosomes/metabolism; Sequence Analysis, RNA

PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration.

Crappé, Jeroen; Ndah, Elvis; Koch, Alexander; Steyaert, Sandra; Gawron, Daria; De Keulenaer, Sarah; De Meester, Ellen; De Meyer, Tim; Van Criekinge, Wim; Van Damme, Petra; Menschaert, Gerben.

Nucleic Acids Res ; 43(5): e29, 2015 Mar 11.

Article in English | MEDLINE | ID: mdl-25510491

ABSTRACT

An increasing number of studies integrate mRNA sequencing data into MS-based proteomics to complement the translation product search space. However, several factors, including extensive regulation of mRNA translation and the need for three- or six-frame translation, impede the use of mRNA-seq data for the construction of a protein sequence search database. With that in mind, we developed the PROTEOFORMER tool that automatically processes data of the recently developed ribosome profiling method (sequencing of ribosome-protected mRNA fragments), resulting in genome-wide visualization of ribosome occupancy. Our tool also includes a translation initiation site calling algorithm allowing the delineation of the open reading frames (ORFs) of all translation products. A complete protein synthesis-based sequence database can thus be compiled for mass spectrometry-based identification. This approach increases the overall protein identification rates by 3% and 11% (improved and new identifications) for human and mouse, respectively, and enables proteome-wide detection of 5'-extended proteoforms, upstream ORF translation and near-cognate translation start sites. The PROTEOFORMER tool is available as a stand-alone pipeline and has been implemented in the Galaxy framework for ease of use.
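To illustrate why the six-frame translation mentioned in the abstract inflates the search space, the following is a minimal sketch that translates a DNA sequence in all six reading frames. It is not part of PROTEOFORMER; the function names `translate` and `six_frame_translation` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumption: no
# alternative codon tables are needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return the translations of the three forward and three reverse frames."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Every transcript yields six candidate protein strings, only one of which (at most) is typically the true product, which is the redundancy a ribosome-profiling-derived database is meant to avoid.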

Subjects

Computational Biology/methods; Mass Spectrometry/methods; Proteome/metabolism; Proteomics/methods; Ribosomes/metabolism; Amino Acid Sequence; Animals; Cells, Cultured; Databases, Protein; Genome/genetics; HCT116 Cells; High-Throughput Nucleotide Sequencing/methods; Humans; Mice; Molecular Sequence Data; Open Reading Frames/genetics; Protein Biosynthesis/genetics; Proteome/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Reproducibility of Results; Ribosomes/genetics; Sequence Homology, Amino Acid

A proteogenomics approach integrating proteomics and ribosome profiling increases the efficiency of protein identification and enables the discovery of alternative translation start sites.

Koch, Alexander; Gawron, Daria; Steyaert, Sandra; Ndah, Elvis; Crappé, Jeroen; De Keulenaer, Sarah; De Meester, Ellen; Ma, Ming; Shen, Ben; Gevaert, Kris; Van Criekinge, Wim; Van Damme, Petra; Menschaert, Gerben.

Proteomics ; 14(23-24): 2688-98, 2014 Dec.

Article in English | MEDLINE | ID: mdl-25156699

ABSTRACT

Next-generation transcriptome sequencing is increasingly integrated with MS to enhance MS-based protein and peptide identification. Recently, a breakthrough in transcriptome analysis was achieved with the development of ribosome profiling (ribo-seq). This technology is based on the deep sequencing of ribosome-protected mRNA fragments, thereby enabling the direct observation of in vivo protein synthesis at the transcript level. In order to explore the impact of a ribo-seq-derived protein sequence search space on MS/MS spectrum identification, we performed a comprehensive proteome study on a human cancer cell line, using both shotgun and N-terminal proteomics alongside ribosome profiling, which was used to delineate (alternative) translational reading frames. By including protein-level evidence of sample-specific genetic variation and alternative translation, this strategy improved the identification score of 69 proteins and identified 22 new proteins in the shotgun experiment. Furthermore, we discovered 18 new alternative translation start sites in the N-terminal proteomics data and observed a correlation between the quantitative measures of ribo-seq and shotgun proteomics, with a Pearson correlation coefficient ranging from 0.483 to 0.664. Overall, this study demonstrated the benefits of ribosome profiling for MS-based protein and peptide identification, and we believe this approach could develop into a common practice for next-generation proteomics.

Subjects

Computational Biology/methods; Proteins/metabolism; Proteomics/methods; Ribosomes/metabolism; HCT116 Cells; Humans; Protein Biosynthesis/genetics; Proteins/genetics; Tandem Mass Spectrometry

Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs.

Crappé, Jeroen; Van Criekinge, Wim; Trooskens, Geert; Hayakawa, Eisuke; Luyten, Walter; Baggerman, Geert; Menschaert, Gerben.

BMC Genomics ; 14: 648, 2013 Sep 23.

Article in English | MEDLINE | ID: mdl-24059539

ABSTRACT

BACKGROUND: It was long assumed that proteins are at least 100 amino acids (AAs) long. Moreover, the detection of short translation products (e.g. coded from small Open Reading Frames, sORFs) is very difficult, as the short length makes it hard to distinguish true coding ORFs from ORFs occurring by chance. Nevertheless, over the past few years many such non-canonical genes (with ORFs < 100 AAs) have been discovered in different organisms like Arabidopsis thaliana, Saccharomyces cerevisiae, and Drosophila melanogaster. Thanks to advances in sequencing, bioinformatics and computing power, it is now possible to scan the genome with unprecedented scrutiny, for example in search of this type of small ORF. RESULTS: Using bioinformatics methods, we performed a systematic search for putatively functional sORFs in the Mus musculus genome. A genome-wide scan detected all sORFs, which were subsequently analyzed for their coding potential, based on evolutionary conservation at the AA level, and ranked using a Support Vector Machine (SVM) learning model. The ranked sORFs were finally overlapped with ribosome profiling data, hinting at sORF translation. All candidates were visually inspected using an in-house developed genome browser. In this way, dozens of highly conserved sORFs targeted by ribosomes were identified in the mouse genome, putatively encoding micropeptides. CONCLUSION: Our combined genome-wide approach leads to the prediction of a comprehensive but manageable set of putatively coding sORFs, a very important first step towards the identification of a new class of bioactive peptides, called micropeptides.
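The scanning step described in the abstract can be sketched as follows. This is an illustrative example only, not the authors' pipeline: it scans the forward strand of a single sequence for ATG-initiated ORFs shorter than 100 codons. The function name `find_sorfs` and the `min_codons` lower bound are assumptions introduced here; the abstract only specifies the upper limit of 100 AAs, and a real genome-wide scan would also cover the reverse strand and be followed by conservation scoring.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_codons=10, max_codons=100):
    """Find short ORFs on the forward strand of a DNA sequence.

    Returns a list of (start, end, orf_sequence) tuples, where start is the
    0-based index of the ATG, end is the index just past the stop codon, and
    the ORF has between min_codons (inclusive) and max_codons (exclusive)
    codons, counting the start codon but not the stop codon.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first in-frame ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons < max_codons:
                    hits.append((start, i + 3, seq[start:i + 3]))
                start = None
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: one 12-codon ORF embedded in flanking bases.
    toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
    for start, end, orf in find_sorfs(toy):
        print(start, end, orf)
```

Because short ORFs of this kind occur frequently by chance, a scan like this returns many candidates, which is why the study ranks them by conservation and overlaps them with ribosome profiling data.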

Subjects

Computational Biology/methods; Computer Simulation; Genome/genetics; Open Reading Frames/genetics; Ribosomes/genetics; Animals; Base Sequence; DNA, Intergenic/genetics; Embryonic Stem Cells/metabolism; Mice; Molecular Sequence Data; Peptides/genetics; RNA, Untranslated/genetics; Reproducibility of Results; Sequence Alignment

Deep proteome coverage based on ribosome profiling aids mass spectrometry-based protein and peptide discovery and provides evidence of alternative translation products and near-cognate translation initiation events.

Menschaert, Gerben; Van Criekinge, Wim; Notelaers, Tineke; Koch, Alexander; Crappé, Jeroen; Gevaert, Kris; Van Damme, Petra.

Mol Cell Proteomics ; 12(7): 1780-90, 2013 Jul.

Article in English | MEDLINE | ID: mdl-23429522

ABSTRACT

An increasing number of studies involve integrative analysis of gene and protein expression data, taking advantage of new technologies such as next-generation transcriptome sequencing and highly sensitive mass spectrometry (MS) instrumentation. Recently, a strategy termed ribosome profiling (or RIBO-seq), based on deep sequencing of ribosome-protected mRNA fragments and indirectly monitoring protein synthesis, has been described. We devised a proteogenomic approach constructing a custom protein sequence search space, built from both Swiss-Prot- and RIBO-seq-derived translation products, applicable for MS/MS spectrum identification. To record the impact of using the constructed deep proteome database, we performed two alternative MS-based proteomic strategies as follows: (i) a regular shotgun proteomic and (ii) an N-terminal combined fractional diagonal chromatography (COFRADIC) approach. Although the former technique gives an overall assessment at the protein and peptide level, the latter technique, specifically enabling the isolation of N-terminal peptides, is very appropriate in validating the RIBO-seq-derived (alternative) translation initiation site profile. We demonstrate that this proteogenomic approach increases the overall protein identification rate by 2.5% (e.g. new protein products, new protein splice variants, single nucleotide polymorphism variant proteins, and N-terminally extended forms of known proteins) as compared with only searching UniProtKB-SwissProt. Furthermore, using this custom database, identification of N-terminal COFRADIC data resulted in detection of 16 alternative start sites giving rise to N-terminally extended protein variants, besides the identification of four translated upstream ORFs. Notably, the characterization of these new translation products revealed the use of multiple near-cognate (non-AUG) start codons. As deep sequencing techniques are becoming more standard, less expensive, and widespread, we anticipate that mRNA sequencing and especially custom-tailored RIBO-seq will become indispensable in the MS-based protein or peptide identification process. The underlying mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD000124.

Subjects

Databases, Protein; Proteome; Proteomics/methods; Animals; Cell Line; Chromatography; High-Throughput Nucleotide Sequencing; Mice; Peptides/genetics; Ribosomes/genetics; Tandem Mass Spectrometry
