Results 1 - 3 of 3
1.
Article in English | MEDLINE | ID: mdl-38990747

ABSTRACT

Deep learning approaches, such as convolutional neural networks (CNNs) and deep recurrent neural networks (RNNs), have been the backbone of protein function prediction, with promising state-of-the-art (SOTA) results. RNNs, with their in-built ability to (i) focus on past information, (ii) capture both short- and long-range dependencies, and (iii) process sequences bi-directionally, offer a strong sequential processing mechanism. CNNs, in contrast, are confined to short-term information from both the past and the future, although they offer parallelism. Therefore, a novel bi-directional CNN that strictly complies with the sequential processing mechanism of RNNs is introduced and used to develop a protein function prediction framework, Bi-SeqCNN. This is a sub-sequence-based framework. Further, Bi-SeqCNN+ is an ensemble approach that further improves the prediction results. To our knowledge, this is the first time bi-directional CNNs are employed for general temporal data analysis and not just for protein sequences. The proposed architecture produces improvements of up to +5.5% over contemporary SOTA methods on three benchmark protein sequence datasets. Moreover, it is substantially lighter, attaining these results with 0.50-0.70 times the parameters of the SOTA methods.
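To illustrate the idea of a CNN that mimics the directional, sequential processing of a bi-directional RNN, here is a minimal hypothetical sketch: a causal (left-padded) 1D convolution run once over the sequence and once over its reversal, with the two directional feature maps concatenated. This is an assumption-laden illustration, not the authors' Bi-SeqCNN implementation; all layer sizes and names are invented.

```python
# Minimal sketch of a bi-directional causal 1D CNN, loosely inspired by the
# Bi-SeqCNN idea above (hypothetical, not the paper's code). Input is an
# encoded protein sub-sequence of shape (batch, channels, length).
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1D convolution that only sees the current and past positions (left padding)."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        self.pad = kernel_size - 1
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size)

    def forward(self, x):
        # pad on the left so position t never sees positions > t
        x = nn.functional.pad(x, (self.pad, 0))
        return self.conv(x)

class BiCausalCNN(nn.Module):
    """Runs one causal CNN left-to-right and one over the reversed sequence,
    then concatenates the two directional feature maps (RNN-like bidirectionality)."""
    def __init__(self, in_ch=21, hidden=64, kernel_size=7, num_classes=100):
        super().__init__()
        self.fwd = CausalConv1d(in_ch, hidden, kernel_size)
        self.bwd = CausalConv1d(in_ch, hidden, kernel_size)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                        # x: (B, in_ch, L)
        f = torch.relu(self.fwd(x))              # past-only context
        b = torch.relu(self.bwd(torch.flip(x, dims=[2])))
        b = torch.flip(b, dims=[2])              # realign to the original order
        h = torch.cat([f, b], dim=1).mean(dim=2) # pool over sequence positions
        return self.head(h)                      # multi-label GO-term logits

model = BiCausalCNN()
logits = model(torch.randn(2, 21, 300))          # e.g. 2 sub-sequences of length 300
```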

2.
Article in English | MEDLINE | ID: mdl-38843056

ABSTRACT

Proteins are represented in various ways, each contributing differently to protein-related tasks. Here, information from each representation (protein sequence, 3D structure, and interaction data) is combined for efficient protein function prediction. Recently, uni-modal approaches have produced promising results with state-of-the-art attention mechanisms that learn the relative importance of features, whereas multi-modal approaches have produced promising results by simply concatenating the features obtained from the different representations, which increases the overall number of trainable parameters. In this paper, we propose a novel, light-weight cross-modal multi-attention (CrMoMulAtt) mechanism that captures the relative contribution of each modality with a lower number of trainable parameters. The proposed mechanism shows a higher contribution from PPI data and a lower contribution from structure data. The results obtained from the proposed CrossPredGO mechanism show an increase in Fmax in the range of +(3.29 to 7.20)% with at most 31% fewer trainable parameters compared with DeepGO and MultiPredGO.
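The following sketch shows one light-weight way to let a model learn per-modality contributions: a shared scorer produces a softmax weight for each of the three modality embeddings (sequence, structure, PPI) and returns their weighted sum. It is only in the spirit of the CrMoMulAtt description above; the dimensions, class names, and fusion details are assumptions, not the paper's mechanism.

```python
# Hypothetical light-weight attention over three protein modalities
# (sequence, structure, PPI); illustrative only, not CrossPredGO's code.
import torch
import torch.nn as nn

class ModalityAttention(nn.Module):
    """Scores each modality embedding and returns their weighted sum,
    so the model learns the relative contribution of each modality."""
    def __init__(self, dim=128):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # a single shared scorer keeps parameters low

    def forward(self, seq_emb, struct_emb, ppi_emb):                   # each: (B, dim)
        stacked = torch.stack([seq_emb, struct_emb, ppi_emb], dim=1)   # (B, 3, dim)
        weights = torch.softmax(self.score(stacked), dim=1)            # (B, 3, 1)
        fused = (weights * stacked).sum(dim=1)                         # (B, dim)
        return fused, weights.squeeze(-1)  # weights expose per-modality contribution

fuse = ModalityAttention(dim=128)
fused, w = fuse(torch.randn(4, 128), torch.randn(4, 128), torch.randn(4, 128))
print(w.mean(dim=0))   # average learned weight per modality (sequence, structure, PPI)
```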

3.
IEEE/ACM Trans Comput Biol Bioinform; 20(3): 2242-2253, 2023.
Article in English | MEDLINE | ID: mdl-37022217

ABSTRACT

The short- and long-range interactions amongst amino acids in a protein sequence are primarily responsible for the function performed by the protein. Recently, convolutional neural networks (CNNs) have produced promising results on sequential data, including NLP tasks and protein sequences. However, a CNN's strength lies primarily in capturing short-range interactions; it is less effective at long-range interactions. Dilated CNNs, on the other hand, are good at capturing both short- and long-range interactions because of their varied (short and long) receptive fields. Further, CNNs are quite light-weight in terms of trainable parameters, whereas most existing deep learning solutions for protein function prediction (PFP) are multi-modal, rather complex, and heavily parametrized. In this paper, we propose Lite-SeqCNN, a simple, light-weight, sequence-only PFP framework based on sub-sequences and dilated CNNs. By varying dilation rates, Lite-SeqCNN efficiently captures both short- and long-range interactions and has 0.50-0.75 times the trainable parameters of contemporary deep learning models. Further, Lite-SeqCNN+ is an ensemble of three Lite-SeqCNNs developed with different segment sizes that produces even better results than the individual models. The proposed architecture produced improvements of up to 5% over the state-of-the-art approaches Global-ProtEnc Plus, DeepGOPlus, and GOLabeler on three prominent datasets curated from the UniProt database.
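A small sketch of the dilated-convolution idea referenced above: stacking 1D convolutions with increasing dilation rates widens the receptive field (roughly 1 + sum of d*(k-1) over the stack) without adding many parameters, so the same light-weight block sees both nearby and distant residues. This is a hypothetical illustration under assumed layer sizes, not the Lite-SeqCNN architecture itself.

```python
# Hypothetical sequence-only dilated 1D CNN block, in the spirit of Lite-SeqCNN.
# With kernel_size=5 and dilations (1, 2, 4) the receptive field is
# 1 + 4*(1+2+4) = 29 positions, while each layer keeps the same parameter count.
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, in_ch=21, hidden=64, kernel_size=5, dilations=(1, 2, 4)):
        super().__init__()
        layers, ch = [], in_ch
        for d in dilations:
            # "same" padding for a dilated kernel keeps the sequence length unchanged
            layers += [nn.Conv1d(ch, hidden, kernel_size,
                                 dilation=d, padding=d * (kernel_size - 1) // 2),
                       nn.ReLU()]
            ch = hidden
        self.body = nn.Sequential(*layers)

    def forward(self, x):                  # x: (B, in_ch, L) encoded sub-sequence
        return self.body(x).mean(dim=2)    # global average pool over positions

block = DilatedBlock()
features = block(torch.randn(2, 21, 400)) # (2, 64) per-sub-sequence features
# An ensemble in the style of Lite-SeqCNN+ would train several such models with
# different segment sizes and combine their GO-term predictions.
```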


Subjects
Neural Networks, Computer; Proteins; Proteins/genetics; Amino Acid Sequence; Databases, Factual; Amino Acids