DR-BERT: A protein language model to annotate disordered regions.

Nambiar, Ananthan; Forsyth, John Malcolm; Liu, Simon; Maslov, Sergei.

Structure; 2024 Apr 30.

Article in English | MEDLINE | ID: mdl-38701796

ABSTRACT

Despite their lack of a rigid structure, intrinsically disordered regions (IDRs) in proteins play important roles in cellular functions, including mediating protein-protein interactions. Therefore, it is important to computationally annotate IDRs with high accuracy. In this study, we present Disordered Region prediction using Bidirectional Encoder Representations from Transformers (DR-BERT), a compact protein language model. Unlike most popular tools, DR-BERT is pretrained on unannotated proteins and trained to predict IDRs without relying on explicit evolutionary or biophysical data. Despite this, DR-BERT demonstrates significant improvement over existing methods on the Critical Assessment of protein Intrinsic Disorder (CAID) evaluation dataset and outperforms competitors on two out of four test cases in the CAID 2 dataset, while maintaining competitiveness in the others. This performance is due to the information learned during pretraining and DR-BERT's ability to use contextual information.

Transformer Neural Networks for Protein Family and Interaction Prediction Tasks.

Nambiar, Ananthan; Liu, Simon; Heflin, Maeve; Forsyth, John Malcolm; Maslov, Sergei; Hopkins, Mark; Ritz, Anna.

J Comput Biol; 30(1): 95-111, 2023 Jan.

Article in English | MEDLINE | ID: mdl-35950958

ABSTRACT

The scientific community is rapidly generating protein sequence information, but only a fraction of these proteins can be experimentally characterized. While promising deep learning approaches for protein prediction tasks have emerged, they have computational limitations or are designed to solve a specific task. We present a Transformer neural network that pre-trains task-agnostic sequence representations. This model is fine-tuned to solve two different protein prediction tasks: protein family classification and protein interaction prediction. Our method is comparable to existing state-of-the-art approaches for protein family classification while being much more general than other architectures. Further, our method outperforms other approaches for protein interaction prediction for two out of three different scenarios that we generated. These results offer a promising framework for fine-tuning the pre-trained sequence representations for other protein prediction tasks.

Subjects

Computer Neural Networks, Proteins, Amino Acid Sequence