1.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38244571

ABSTRACT

MOTIVATION: Phosphorylation, a post-translational modification regulated by protein kinase enzymes, plays an essential role in almost all cellular processes. Understanding how each of the nearly 500 human protein kinases selectively phosphorylates their substrates is a foundational challenge in bioinformatics and cell signaling. Although deep learning models have been a popular means to predict kinase-substrate relationships, existing models often lack interpretability and are trained on datasets skewed toward a subset of well-studied kinases. RESULTS: Here we leverage recent peptide library datasets generated to determine substrate specificity profiles of 300 serine/threonine kinases to develop an explainable Transformer model for kinase-peptide interaction prediction. The model, trained solely on primary sequences, achieved state-of-the-art performance. Its unique multitask learning paradigm built within the model enables predictions on virtually any kinase-peptide pair, including predictions on 139 kinases not used in peptide library screens. Furthermore, we employed explainable machine learning methods to elucidate the model's inner workings. Through analysis of learned embeddings at different training stages, we demonstrate that the model employs a unique strategy of substrate prediction considering both substrate motif patterns and kinase evolutionary features. SHapley Additive exPlanation (SHAP) analysis reveals key specificity determining residues in the peptide sequence. Finally, we provide a web interface for predicting kinase-substrate associations for user-defined sequences and a resource for visualizing the learned kinase-substrate associations. AVAILABILITY AND IMPLEMENTATION: All code and data are available at https://github.com/esbgkannan/Phosformer-ST. Web server is available at https://phosformer.netlify.app.


Subjects
Peptide Library , Protein Kinases , Humans , Protein Kinases/metabolism , Phosphorylation , Peptides/chemistry , Machine Learning
2.
PeerJ ; 11: e16087, 2023.
Article in English | MEDLINE | ID: mdl-38077442

ABSTRACT

The Protein Kinase Ontology (ProKinO) is an integrated knowledge graph that conceptualizes the complex relationships among protein kinase sequence, structure, function, and disease in a human- and machine-readable format. In this study, we have significantly expanded ProKinO by incorporating additional data on expression patterns and drug interactions. Furthermore, we have developed a completely new browser from the ground up to render the knowledge graph visible and interactive on the web. We have enriched ProKinO with new classes and relationships that capture information on kinase ligand binding sites, expression patterns, and functional features. These additions extend ProKinO's capabilities as a discovery tool, enabling it to uncover novel insights about understudied members of the protein kinase family. We next demonstrate the application of ProKinO. Specifically, through graph mining and aggregate SPARQL queries, we identify the p21-activated protein kinase 5 (PAK5) as one of the most frequently mutated dark kinases in human cancers with abnormal expression in multiple cancers, including a previously unappreciated role in acute myeloid leukemia. We have identified recurrent oncogenic mutations in the PAK5 activation loop predicted to alter substrate binding and phosphorylation. Additionally, we have identified common ligand/drug binding residues in PAK family kinases, underscoring ProKinO's potential application in drug discovery. The updated ontology browser and the addition of a web component, ProtVista, which enables interactive mining of kinase sequence annotations in 3D structures and AlphaFold models, provide a valuable resource for the signaling community. The updated ProKinO database is accessible at https://prokino.uga.edu.


Subjects
Neoplasms , Protein Kinases , Humans , Protein Kinases/genetics , Ligands , Proteins/genetics , Phosphorylation
3.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36692152

ABSTRACT

MOTIVATION: The human genome encodes over 500 distinct protein kinases which regulate nearly all cellular processes by the specific phosphorylation of protein substrates. While advances in mass spectrometry and proteomics studies have identified thousands of phosphorylation sites across species, information on the specific kinases that phosphorylate these sites is currently lacking for the vast majority of phosphosites. Recently, there has been a major focus on the development of computational models for predicting kinase-substrate associations. However, most current models only allow predictions on a subset of well-studied kinases. Furthermore, the utilization of hand-curated features and imbalances in training and testing datasets pose unique challenges in the development of accurate predictive models for kinase-specific phosphorylation prediction. Motivated by the recent development of universal protein language models which automatically generate context-aware features from primary sequence information, we sought to develop a unified framework for kinase-specific phosphosite prediction, allowing for greater investigative utility and enabling substrate predictions at the whole kinome level. RESULTS: We present a deep learning model for kinase-specific phosphosite prediction, termed Phosformer, which predicts the probability of phosphorylation given an arbitrary pair of unaligned kinase and substrate peptide sequences. We demonstrate that Phosformer implicitly learns evolutionary and functional features during training, removing the need for feature curation and engineering. Further analyses reveal that Phosformer also learns substrate specificity motifs and is able to distinguish between functionally distinct kinase families. Benchmarks indicate that Phosformer exhibits significant improvements compared to the state-of-the-art models, while also presenting a more generalized, unified, and interpretable predictive framework. 
AVAILABILITY AND IMPLEMENTATION: Code and data are available at https://github.com/esbgkannan/phosformer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
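
As an illustration of the kind of input described in this abstract (a kinase sequence paired with a substrate peptide), the minimal sketch below enumerates candidate peptide windows centered on serine, threonine, or tyrosine residues of a substrate sequence. The function name, the default flank of 5 residues, and the "-" padding character are assumptions made for illustration; they are not taken from the Phosformer code base.

```python
def candidate_peptides(sequence, flank=5, pad="-"):
    """Return (position, peptide) pairs centered on S/T/Y residues.

    position is 1-based. Each peptide has length 2 * flank + 1 and is
    padded with `pad` where the window runs past either terminus.
    The flank width and padding character are illustrative assumptions.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    peptides = []
    for i, residue in enumerate(seq):
        if residue in "STY":
            # Residue i sits at index i + flank in `padded`,
            # so the window centered on it starts at index i.
            peptides.append((i + 1, padded[i:i + 2 * flank + 1]))
    return peptides
```

Each returned peptide could then be paired with a kinase sequence and passed to a predictor of choice; the pairing and scoring steps are outside the scope of this sketch.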


Subjects
Protein Kinases , Post-Translational Protein Processing , Humans , Phosphorylation , Protein Kinases/metabolism , Proteins/metabolism
4.
Glycobiology ; 31(11): 1472-1477, 2021 12 18.
Article in English | MEDLINE | ID: mdl-34351427

ABSTRACT

Glycosyltransferases (GTs) play a central role in sustaining all forms of life through the biosynthesis of complex carbohydrates. Despite significant strides made in recent years to establish computational resources, databases and tools to understand the nature and role of carbohydrates and related glycoenzymes, a data analytics framework that connects the sequence-structure-function relationships to the evolution of GTs is currently lacking. This hinders the characterization of understudied GTs and the synthetic design of GTs for medical and biotechnology applications. Here, we present GTXplorer as an integrated platform that presents evolutionary information of GTs adopting a GT-A fold in an intuitive format enabling in silico investigation through comparative sequence analysis to derive informed hypotheses about their function. The tree view mode provides an overview of the evolutionary relationships of GT-A families and allows users to select phylogenetically relevant families for comparisons. The selected families can then be compared in the alignment view at the residue level using annotated weblogo stacks of the GT-A core specific to the selected clade, family, or subfamily. All data are easily accessible and can be downloaded for further analysis. GTXplorer can be accessed at https://vulcan.cs.uga.edu/gtxplorer/ or from GitHub at https://github.com/esbgkannan/GTxplorer to deploy locally. By packaging multiple data streams into an accessible, user-friendly format, GTXplorer presents the first evolutionary data analytics platform for comparative glycomics.
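
The weblogo stacks mentioned in this abstract summarize residue conservation at each alignment column. A minimal sketch of the standard information-content calculation for one protein alignment column (log2 of the alphabet size minus the Shannon entropy, with no small-sample correction) is given below. The function name and the choice to ignore gap characters are illustrative assumptions, not a description of how GTXplorer computes its logos.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20, gap="-"):
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residue frequencies. Gap characters are ignored, and an
    all-gap column is assigned 0.0 (both are assumptions of this sketch).
    """
    residues = [c for c in column.upper() if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy
```

A fully conserved column reaches the maximum of log2(20) bits, while a column split evenly between two residues scores one bit less.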


Subjects
Computational Biology , Glycosyltransferases/chemistry , Biocatalysis , Carbohydrates/biosynthesis , Carbohydrates/chemistry , Glycomics , Glycosyltransferases/metabolism , Protein Folding