Search | Portal Regional da BVS

SpliceProt 2.0: A Sequence Repository of Human, Mouse, and Rat Proteoforms.

Santos, Letícia Graziela Costa; Parreira, Vinícius da Silva Coutinho; da Silva, Esdras Matheus Gomes; Santos, Marlon Dias Mariano; Fernandes, Alexander da Franca; Neves-Ferreira, Ana Gisele da Costa; Carvalho, Paulo Costa; Freitas, Flávia Cristina de Paula; Passetti, Fabio.

Int J Mol Sci; 25(2), 2024 Jan 18.

Article in English | MEDLINE | ID: mdl-38256255

ABSTRACT

SpliceProt 2.0 is a public proteogenomics database that aims to list the sequence of known proteins and potential new proteoforms in human, mouse, and rat proteomes. This updated repository provides an even broader range of computationally translated proteins and serves, for example, to aid with proteomic validation of splice variants absent from the reference UniProtKB/SwissProt database. We demonstrate the value of SpliceProt 2.0 to predict orthologous proteins between humans and murines based on transcript reconstruction, sequence annotation and detection at the transcriptome and proteome levels. In this release, the annotation data used in the reconstruction of transcripts based on the methodology of ternary matrices were acquired from new databases such as Ensembl, UniProt, and APPRIS. Another innovation implemented in the pipeline is the exclusion of transcripts predicted to be susceptible to degradation through the NMD pathway. Taken together, our repository and its applications represent a valuable resource for the proteogenomics community.
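
The NMD exclusion step mentioned above can be illustrated with a minimal sketch. The abstract does not state which criterion the pipeline applies, so this example assumes the widely used "50-nucleotide rule": a transcript is flagged when its stop codon ends more than about 50 nt upstream of the last exon-exon junction. The function names, the threshold default, and the toy exon lengths are all hypothetical.

```python
from typing import Sequence


def last_junction_position(exon_lengths: Sequence[int]) -> int:
    """Transcript coordinate of the last exon-exon junction.

    It equals the summed length of every exon except the final one.
    """
    return sum(exon_lengths[:-1])


def is_nmd_candidate(
    exon_lengths: Sequence[int],
    stop_codon_end: int,
    min_distance: int = 50,  # assumed threshold, not taken from the source
) -> bool:
    """Flag a transcript as a likely NMD target under the 50-nt rule.

    stop_codon_end is the number of nucleotides from the transcript start
    through the last base of the stop codon.
    """
    if len(exon_lengths) < 2:
        # Single-exon transcripts have no junction, so the rule cannot apply.
        return False
    distance = last_junction_position(exon_lengths) - stop_codon_end
    return distance > min_distance


def filter_nmd(transcripts: dict) -> dict:
    """Keep only transcripts that are not flagged as NMD candidates."""
    return {
        name: (exons, stop)
        for name, (exons, stop) in transcripts.items()
        if not is_nmd_candidate(exons, stop)
    }
```

With three exons of 100, 200, and 150 nt, the last junction sits at position 300, so a stop codon ending at position 200 would be flagged while one ending at position 280 would be kept.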

Subjects

Proteogenomics; Proteomics; Rats; Mice; Humans; Animals; Databases, Protein; Knowledge Bases; Proteome/genetics

Proteogenomics Reveals Orthologous Alternatively Spliced Proteoforms in the Same Human and Mouse Brain Regions with Differential Abundance in an Alzheimer's Disease Mouse Model.

da Silva, Esdras Matheus Gomes; Santos, Letícia Graziela Costa; de Oliveira, Flávia Santiago; Freitas, Flávia Cristina de Paula; Parreira, Vinícius da Silva Coutinho; Dos Santos, Hellen Geremias; Tavares, Raphael; Carvalho, Paulo Costa; Neves-Ferreira, Ana Gisele da Costa; Haibara, Andrea Siqueira; de Araujo-Souza, Patrícia Savio; Dias, Adriana Abalen Martins; Passetti, Fabio.

Cells; 10(7), 2021 Jun 23.

Article in English | MEDLINE | ID: mdl-34201730

ABSTRACT

Alternative splicing (AS) may increase the number of proteoforms produced by a gene. Alzheimer's disease (AD) is a neurodegenerative disease with well-characterized AS proteoforms. In this study, we used a proteogenomics strategy to build a customized protein sequence database and identify orthologous AS proteoforms between humans and mice on publicly available shotgun proteomics (MS/MS) data of the corpus callosum (CC) and olfactory bulb (OB). Identical proteotypic peptides of six orthologous AS proteoforms were found in both species: PKM1 (gene PKM/Pkm), STXBP1a (gene STXBP1/Stxbp1), Isoform 3 (gene HNRNPK/Hnrnpk), LCRMP-1 (gene CRMP1/Crmp1), SP3 (gene CADM1/Cadm1), and PKCβII (gene PRKCB/Prkcb). These AS variants were also detected at the transcript level in publicly available RNA-Seq data and experimentally validated by RT-qPCR. Additionally, PKM1 and STXBP1a were detected at higher abundances in a publicly available MS/MS dataset of the AD mouse model APP/PS1 than in its wild-type counterpart. These data corroborate other reports, which suggest that PKM1 and STXBP1a AS proteoforms might play a role in amyloid-like aggregate formation. To the best of our knowledge, this report is the first to describe PKM1 and STXBP1a overexpression in the OB of an AD mouse model. We hope that our strategy may be of use in future human neurodegenerative studies using mouse models.
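
The idea of identical proteotypic peptides shared by orthologous isoforms can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it uses a simplified in silico trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages), treats a peptide as proteotypic when it occurs in exactly one isoform of a gene, and runs on invented toy sequences. All function names and the minimum peptide length are hypothetical.

```python
import re


def tryptic_peptides(sequence: str, min_len: int = 6) -> set:
    """Digest a protein in silico: cleave after K/R, but not before P."""
    fragments = re.split(r"(?<=[KR])(?!P)", sequence)
    return {frag for frag in fragments if len(frag) >= min_len}


def proteotypic_peptides(isoforms: dict) -> dict:
    """Map each isoform to the peptides found in no other isoform."""
    digests = {name: tryptic_peptides(seq) for name, seq in isoforms.items()}
    unique = {}
    for name, peptides in digests.items():
        others = set()
        for other_name, other_peptides in digests.items():
            if other_name != name:
                others |= other_peptides
        unique[name] = peptides - others
    return unique


def shared_proteotypic(human: dict, mouse: dict) -> dict:
    """Peptides that are proteotypic for the same isoform in both species."""
    human_unique = proteotypic_peptides(human)
    mouse_unique = proteotypic_peptides(mouse)
    return {
        name: human_unique[name] & mouse_unique[name]
        for name in human_unique
        if name in mouse_unique
    }


# Invented toy sequences; they do not correspond to real proteins.
human_isoforms = {"iso1": "MAAAAAKCCCCCCRDDDDDDK", "iso2": "MAAAAAKEEEEEERDDDDDDK"}
mouse_isoforms = {"iso1": "MAAAAGKCCCCCCRDDDDDDK", "iso2": "MAAAAGKEEEEEQRDDDDDDK"}
shared = shared_proteotypic(human_isoforms, mouse_isoforms)
```

In this toy case only `iso1` keeps a peptide that is both isoform-specific and identical in the two species, which is the kind of evidence that allows one spectrum library to support the same proteoform in human and mouse.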

Subjects

Alternative Splicing/genetics; Alzheimer Disease/genetics; Brain/metabolism; Proteogenomics; Amino Acid Sequence; Animals; Databases, Protein; Disease Models, Animal; Exons/genetics; Humans; Male; Mice, Inbred C57BL; Peptides/chemistry; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA-Seq; Transcriptome/genetics

ExVe: The knowledge base of orthologous proteins identified in fungal extracellular vesicles.

Parreira, Vinícius da Silva Coutinho; Santos, Letícia Graziela Costa; Rodrigues, Marcio L; Passetti, Fabio.

Comput Struct Biotechnol J; 19: 2286-2296, 2021.

Article in English | MEDLINE | ID: mdl-33995920

ABSTRACT

Extracellular vesicles (EVs) are double-membrane particles associated with intercellular communication. Since the discovery of EV production in the fungus Cryptococcus neoformans, the importance of EV release in its physiology and pathogenicity has been investigated. To date, few studies have investigated the proteomic content of EVs from multiple fungal species. Our main objective was to use an orthology approach to compare proteins identified by EV shotgun proteomics in 8 pathogenic and 1 nonpathogenic species. Using protein information from the UniProt and FungiDB databases, we integrated data for 11,433 hits in fungal EVs with an orthology perspective, resulting in 3,834 different orthologous groups. OG6_100083 (Hsp70 Pfam domain) was the only orthologous group identified in all fungal species. Proteins with this protein domain are associated with the stress response, survival and morphological changes in different fungal species. Although no pathogenic orthologous group was found, we identified 5 orthologous groups exclusive to S. cerevisiae. Using the criterion of at least 7 pathogenic fungi to define a cluster, we detected 4 unique pathogenic orthologous groups. Taken together, our data suggest that Hsp70-related proteins might play a key role in fungal EVs, regardless of the pathogenic status. Using an orthology approach, we identified at least 4 protein domains that could be novel therapeutic targets against pathogenic fungi. Our results were compiled in the herein described ExVe database, which is publicly available at http://exve.icc.fiocruz.br.
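
The set logic behind this comparison can be sketched in a few lines. The example below is only an illustration under stated assumptions: the species labels, the orthologous group identifiers other than OG6_100083, and the lowered threshold of 2 pathogens (the abstract uses at least 7 of 8) are invented so that the toy data stay small, and the function names are hypothetical rather than part of ExVe.

```python
from collections import Counter


def core_groups(species_ogs: dict) -> set:
    """Orthologous groups detected in every species."""
    return set.intersection(*species_ogs.values())


def pathogen_clusters(species_ogs: dict, pathogens: set, min_pathogens: int) -> set:
    """Groups seen in at least min_pathogens pathogens and in no other species."""
    counts = Counter()
    non_pathogen_ogs = set()
    for species, ogs in species_ogs.items():
        if species in pathogens:
            counts.update(ogs)
        else:
            non_pathogen_ogs |= ogs
    frequent = {og for og, n in counts.items() if n >= min_pathogens}
    return frequent - non_pathogen_ogs


def exclusive_groups(species_ogs: dict, species: str) -> set:
    """Groups detected in one species and in none of the others."""
    others = set()
    for name, ogs in species_ogs.items():
        if name != species:
            others |= ogs
    return species_ogs[species] - others


# Invented toy data; only OG6_100083 is an identifier from the abstract.
toy = {
    "pathogen_A": {"OG6_100083", "OG_x", "OG_y"},
    "pathogen_B": {"OG6_100083", "OG_x"},
    "pathogen_C": {"OG6_100083", "OG_x", "OG_y", "OG_z"},
    "nonpathogen": {"OG6_100083", "OG_y", "OG_n"},
}
toy_pathogens = {"pathogen_A", "pathogen_B", "pathogen_C"}
```

On the toy data, `core_groups(toy)` yields the single group shared by all species, `pathogen_clusters(toy, toy_pathogens, 2)` yields the group restricted to pathogens, and `exclusive_groups(toy, "nonpathogen")` yields the group found only in the nonpathogenic species.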

RAFTS³G: an efficient and versatile clustering software to analyses in large protein datasets.

de Lima Nichio, Bruno Thiago; de Oliveira, Aryel Marlus Repula; de Pierri, Camilla Reginatto; Santos, Leticia Graziela Costa; Lejambre, Alexandre Quadros; Vialle, Ricardo Assunção; da Rocha Coimbra, Nilson Antônio; Guizelini, Dieval; Marchaukoski, Jeroniza Nunes; de Oliveira Pedrosa, Fabio; Raittz, Roberto Tadeu.

BMC Bioinformatics; 20(1): 392, 2019 Jul 15.

Article in English | MEDLINE | ID: mdl-31307371

ABSTRACT

BACKGROUND: Clustering methods are essential for partitioning biological samples and are useful for reducing information complexity in large datasets. Tools in this context usually rely on greedy algorithms, which solve some data mining difficulties but can degrade biologically relevant information during the clustering process. The lack of standardized metrics and consistent databases also raises questions about the clustering efficiency of some methods. Benchmarks are needed to explore the full potential of clustering methods, among which alignment-free methods stand out, and a good choice of dataset is essential for them. RESULTS: Here we present a new approach to data mining in large protein sequence datasets, the Rapid Alignment Free Tool for Sequences Similarity Search to Groups (RAFTS3G), a clustering method that aims to lose less biological information while generating groups. The strategy developed in our algorithm is optimized to be more stringent, which is reflected in increased accuracy and sensitivity when generating clusters over a wide range of similarity. RAFTS3G is the better choice compared with three main methods when the user wants a more reliable result even without knowing the ideal clustering threshold. CONCLUSION: In general, RAFTS3G is able to group up to millions of biological sequences in large datasets, which makes it a remarkably efficient option for clustering. Compared with other "gold-standard" methods for clustering large biological data, RAFTS3G maintains the balance between reducing redundancy in biological information and creating consistent groups. We apply the binary search concept to grouped sequences, which maintains the sensitivity/accuracy relation and reduces the time needed to generate data with the RAFTS3G process.
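
To make the notion of alignment-free clustering concrete, here is a minimal sketch. It is not the RAFTS3G algorithm, whose internals the abstract does not describe: it swaps in a generic technique, k-mer Jaccard similarity followed by single-linkage grouping with a union-find structure. The value of k, the similarity threshold, the function names, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmer_set(sequence: str, k: int = 3) -> set:
    """All overlapping substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Shared k-mers divided by total distinct k-mers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(sequences: dict, threshold: float = 0.5, k: int = 3) -> list:
    """Group sequences whose k-mer Jaccard similarity reaches the threshold."""
    profiles = {name: kmer_set(seq, k) for name, seq in sequences.items()}
    parent = {name: name for name in sequences}

    def find(x: str) -> str:
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sequences, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for name in sequences:
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())


# Invented toy sequences for illustration only.
toy_sequences = {"p1": "MKTAYIAKQR", "p2": "MKTAYIAKQK", "p3": "GGSSGGSSGG"}
toy_clusters = cluster(toy_sequences)
```

The all-against-all comparison here is quadratic in the number of sequences, so it only suits small inputs; tools built for millions of sequences need an indexing or search strategy to avoid it.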

Subjects

Proteins/chemistry; Software; Algorithms; Cluster Analysis; Data Mining; Databases, Protein
