1.
Mol Pharm ; 21(4): 1563-1590, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38466810

ABSTRACT

Understanding protein sequence and structure is essential for understanding protein-protein interactions (PPIs), which are essential for many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for drug reactivity, toxicology, drug screening, and quantitative structure-activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of artificial-intelligence-based rational drug design technologies that target proteins and protein-protein interaction (PPI) binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are discussed.


Subject(s)
Artificial Intelligence , Drug Discovery , Drug Discovery/methods , Drug Design , Machine Learning , Quantitative Structure-Activity Relationship
2.
Comput Methods Programs Biomed ; 244: 107955, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38064959

ABSTRACT

BACKGROUND AND OBJECTIVE: Protein-protein interaction (PPI) is a vital process in all living cells, controlling essential cell functions such as cell cycle regulation, signal transduction, and metabolic processes with broad applications that include antibody therapeutics, vaccines, and drug discovery. The problem of sequence-based PPI prediction has been a long-standing issue in computational biology. METHODS: We introduce MaTPIP, a cutting-edge deep-learning framework for predicting PPI. MaTPIP stands out due to its innovative design, fusing pre-trained Protein Language Model (PLM)-based features with manually curated protein sequence attributes, emphasizing the part-whole relationship by incorporating two-dimensional granular part (amino-acid) level features and one-dimensional whole-level (protein) features. What sets MaTPIP apart is its ability to integrate these features across three different input terminals seamlessly. MaTPIP also includes a distinctive configuration of Convolutional Neural Network (CNN) with Transformer components for concurrent utilization of CNN and sequential characteristics in each iteration and a one-dimensional to two-dimensional converter followed by a unified embedding. The statistical significance of this classifier is validated using McNemar's test. RESULTS: MaTPIP outperformed the existing methods on both the Human PPI benchmark and cross-species PPI testing datasets, demonstrating its immense generalization capability for PPI prediction. We used seven diverse datasets with varying PPI target class distributions. Notably, within the novel PPI scenario, the most challenging category for Human PPI Benchmark, MaTPIP improves the existing state-of-the-art score from 74.1% to 78.6% (measured in Area under ROC Curve), from 23.2% to 32.8% (in average precision) and from 4.9% to 9.5% (in precision at 3% recall) for 50%, 10% and 0.3% target class distributions, respectively.
In cross-species PPI evaluation, hybrid MaTPIP establishes a new benchmark score (measured in Area Under precision-recall curve) of 81.1% from the previous 60.9% for Mouse, 80.9% from 56.2% for Fly, 78.1% from 55.9% for Worm, 59.9% from 41.7% for Yeast, and 66.2% from 58.8% for E. coli. Our eXplainable AI-based assessment reveals an average contribution of different feature families per prediction on these datasets. CONCLUSIONS: MaTPIP mixes manually curated features with the features extracted from the pre-trained PLM to predict sequence-based protein-protein association. Furthermore, MaTPIP demonstrates strong generalization capabilities for cross-species PPI predictions.
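
The abstract names McNemar's test as the significance check for the classifier. As a rough illustration only, the sketch below computes the continuity-corrected McNemar statistic from the two discordant counts of a paired comparison between two classifiers. The function name and the example counts are hypothetical and are not taken from the MaTPIP paper.

```python
import math


def mcnemar(b, c):
    """Continuity-corrected McNemar test on paired classifier outcomes.

    b: test cases only classifier A predicted correctly (hypothetical input)
    c: test cases only classifier B predicted correctly (hypothetical input)
    Returns (chi-squared statistic, p-value) with 1 degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, nothing to compare
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-squared with 1 dof: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    print(mcnemar(25, 5))
```

Only the discordant pairs enter the statistic, which is why the test suits two classifiers evaluated on the same test set.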


Subject(s)
Deep Learning , Humans , Animals , Mice , Neural Networks, Computer , Proteins/metabolism , Amino Acid Sequence , ROC Curve
3.
Biochim Biophys Acta Mol Basis Dis ; 1869(6): 166702, 2023 08.
Article in English | MEDLINE | ID: mdl-37044238

ABSTRACT

Chemoresistance is a primary cause of breast cancer treatment failure, and protein-protein interactions significantly contribute to chemoresistance during different stages of breast cancer progression. In pursuit of novel biomarkers and relevant protein-protein interactions occurring during the emergence of breast cancer chemoresistance, we used a computational predictive biological (CPB) approach. CPB identified associations of adhesion molecules with proteins connected with different breast cancer proteins associated with chemoresistance. This approach identified an association of Integrin β1 (ITGB1) with chemoresistance and breast cancer stem cell markers. ITGB1 activated the Focal Adhesion Kinase (FAK) pathway promoting invasion, migration, and chemoresistance in breast cancer by upregulating Erk phosphorylation. FAK also activated Wnt/Sox2 signaling, which enhanced self-renewal in breast cancer. Activation of the FAK pathway by ITGB1 represents a novel mechanism linked to breast cancer chemoresistance, which may lead to novel therapies capable of blocking breast cancer progression by intervening in ITGB1-regulated signaling pathways.


Subject(s)
Breast Neoplasms , Integrin beta1 , Female , Humans , Biomarkers , Breast Neoplasms/drug therapy , Cell Line, Tumor , Drug Resistance, Neoplasm , Focal Adhesion Protein-Tyrosine Kinases/metabolism , Integrin beta1/metabolism
5.
Sci Rep ; 13(1): 4692, 2023 03 22.
Article in English | MEDLINE | ID: mdl-36949118

ABSTRACT

India witnessed an unprecedented surge in SARS-CoV-2 infections, with dire consequences, during the second wave of COVID-19, but a detailed report of the epidemiology-based spatiotemporal incidence of the disease is missing. In this manuscript, we have applied various statistical approaches (correlation, hierarchical clustering) to decipher the pattern of pathogenesis of the circulating VoCs responsible for the surge in incidence. B.1.617.1 (Kappa) was the predominant VoC during the early phase of the second wave, whereas Delta (B.1.617.2) or Delta-like (AY.x) VoCs constituted the majority ([Formula: see text]%) of the cases during the peak of the second wave. The correlation plot of the Delta/Delta-like lineage demonstrates an inverse correlation with other lineages including B.1.617.1, B.1.1.7, B.1, B.1.36.29 and B.1.36. The spatiotemporal analysis shows that most of the Indian states were affected during the peak of the second wave due to the Delta surge, and fall under the same cluster. The second cluster, populated mostly by north-eastern states and the islands of India, was minimally affected. The presence of signature mutations (T478K, D950N, E156G) along with L452K, D614G and P681R within the spike protein of Delta or Delta-like variants might cause elevated host cell attachment, increased transmission and altered antigenicity, which in due course of time led them to replace the other circulating variants. The timely assessment of new VoCs including Delta-like will provide a rationale for updating diagnostics, vaccine development by medical industries and decision making by various agencies including government, educational institutions, and corporate industries.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Asian People , COVID-19/epidemiology , COVID-19/virology , India/epidemiology , Mutation , SARS-CoV-2/genetics
6.
J Biomol Struct Dyn ; 41(7): 2937-2946, 2023 04.
Article in English | MEDLINE | ID: mdl-35220920

ABSTRACT

De novo protein design explores the untapped sequence space that has been little sampled during the evolutionary process. This necessitates an efficient sequence space search engine for effective convergence in computational protein design. We propose a greedy simulated annealing-based Monte Carlo parallel search algorithm for better sequence-structure compatibility probing in protein design. The guidance provided by the evolutionary profile, the greedy approach, and the cooling schedule adopted in the Monte Carlo simulation ensure sufficient exploration and exploitation of the search space, leading to faster convergence. On evaluating the proposed algorithm on a dataset of 76 target scaffolds, we find an average root-mean-square deviation (RMSD) of 1.07 Å and an average TM-score of 0.93 between the target scaffolds and the modeled designed protein sequences. The high sequence recapitulation of 48.7% (59.4%) observed in the design sequences for all (hydrophobic) solvent-inaccessible residues again establishes the quality of the proposed algorithm. A high (93.4%) intra-group recapitulation of hydrophobic residues in the solvent-inaccessible region indicates that the proposed protein design algorithm preserves the core residues in the protein and provides alternative residue combinations in the solvent-accessible regions of the target protein. Furthermore, a COFACTOR-based protein functional analysis shows that the design sequences exhibit altered molecular functionality and introduce new molecular functions compared to the target scaffolds. Communicated by Ramaswamy H. Sarma.
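
To make the idea of a simulated annealing Monte Carlo sequence search concrete, here is a minimal, generic sketch with a Metropolis acceptance rule and a geometric cooling schedule. It is not the authors' algorithm: the toy energy (mismatches against a hypothetical preferred sequence standing in for an evolutionary profile), the parameter values, and all names are assumptions made for illustration.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def energy(seq, preferred):
    # Toy score (assumption): positions that disagree with the preferred residue.
    return sum(a != b for a, b in zip(seq, preferred))


def anneal(preferred, steps=2000, t_start=2.0, cooling=0.995, seed=0):
    """Simulated annealing over sequences; returns (best sequence, best energy)."""
    rng = random.Random(seed)
    seq = [rng.choice(ALPHABET) for _ in preferred]
    current = energy(seq, preferred)
    best_seq, best = seq[:], current
    temp = t_start
    for _ in range(steps):
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(ALPHABET)  # propose a single-point mutation
        new = energy(seq, preferred)
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        if new <= current or rng.random() < math.exp((current - new) / temp):
            current = new
            if current < best:
                best_seq, best = seq[:], current
        else:
            seq[pos] = old  # reject the move and restore the residue
        temp *= cooling  # geometric cooling schedule
    return "".join(best_seq), best


if __name__ == "__main__":
    print(anneal("MKTAYIAKQR"))
```

Early on, the high temperature lets the search accept worse sequences (exploration); as the temperature drops, it behaves almost greedily (exploitation).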


Subject(s)
Proteins , Search Engine , Proteins/chemistry , Amino Acid Sequence , Computer Simulation , Solvents
7.
Cancer Lett ; 544: 215811, 2022 09 28.
Article in English | MEDLINE | ID: mdl-35787922

ABSTRACT

Fusion genes are abnormal genes resulting from chromosomal translocation, insertion, deletion, inversion, etc. ETV6, a rather promiscuous partner forms fusions with several other genes, most commonly, the NTRK3 gene. This fusion leads to the formation of a constitutively activated tyrosine kinase which activates the Ras-Raf-MEK and PI3K/AKT/MAPK pathways, leading the cells through cycles of uncontrolled division and ultimately resulting in cancer. Targeted therapies against this ETV6-NTRK3 fusion protein are much needed. Therefore, to find a targeted approach, a transcription factor RBPJ regulating the ETV6 gene was established and since the ETV6-NTRK3 fusion gene is downstream of the ETV6 promoter/enhancer, this fusion protein is also regulated. The regulation of the ETV6 gene via RBPJ was validated by ChIP analysis in human glioblastoma (GBM) cell lines and patient tissue samples. This study was further followed by the identification of an inhibitor, Furamidine, against transcription factor RBPJ. It was found to be binding with the DNA binding domain of RBPJ with antitumorigenic properties and minimal organ toxicity. Hence, a new target RBPJ, regulating the production of ETV6 and ETV6-NTRK3 fusion protein was found along with a potent RBPJ inhibitor Furamidine.


Subject(s)
DNA-Binding Proteins , Glioblastoma , DNA-Binding Proteins/genetics , Glioblastoma/drug therapy , Glioblastoma/genetics , Humans , Immunoglobulin J Recombination Signal Sequence-Binding Protein , Oncogene Proteins, Fusion/genetics , Oncogene Proteins, Fusion/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Proto-Oncogene Proteins c-ets/genetics , Receptor, trkC/genetics , Receptor, trkC/metabolism , Repressor Proteins/chemistry , Repressor Proteins/genetics , Transcription Factors/genetics
8.
J Mol Model ; 28(6): 167, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612652

ABSTRACT

The modular organization of a cell which can be determined by its interaction network allows us to understand a mesh of cooperation among the functional modules. Therefore, cellular-level identification of functional modules aids in understanding the functional and structural characteristics of the biological network of a cell and also assists in determining or comprehending the evolutionary signal. We develop ProMoCell that performs real-time Web scraping for generating clusters of the cellular level functional units of an organism. ProMoCell constructs the Protein Locality Graphs and clusters the cellular level functional units of an organism by utilizing experimentally verified data from various online sources. Also, we develop ProModb, a database service that houses precomputed whole-cell protein-protein interaction network-based functional modules of an organism using ProMoCell. Our Web service is entirely synchronized with the KEGG pathway database and allows users to generate spatially localized protein modules for any organism belonging to the KEGG genome using its real-time Web scraping characteristics. Hence, the server will host as many organisms as is maintained by the KEGG database. Our Web services provide the users a comprehensive and integrated tool for an efficient browsing and extraction of the spatial locality-based protein locality graph and the functional modules constructed by gathering experimental data from several interaction databases and pathway maps. We believe that our Web services will be beneficial in pharmacological research, where a novel research domain called modular pharmacology has initiated the study on the diagnosis, prevention, and treatment of deadly diseases using functional modules.


Subject(s)
Algorithms , Proteins , Protein Interaction Mapping , Protein Interaction Maps
9.
J Biomol Struct Dyn ; 40(21): 11274-11290, 2022.
Article in English | MEDLINE | ID: mdl-34338141

ABSTRACT

Human familial prion diseases are known to be associated with different single-point mutants of the gene coding for prion protein, with a primary focus at several locations of the globular domain. We have identified 12 different single-point pathogenic mutants of human prion protein (HuPrP) with the help of an extensive perturbation/mutation technique at multiple locations of the HuPrP sequence related to the potential for conformational disorders. Among these, some of the mutants include pathogenic variants that corroborate well with proteins reported in the literature, while the majority are unique single-point mutants that either have not been explicitly studied earlier or have been studied for variants with different residues at the specific position. Primarily, our study sheds light in depth on the unfolding mechanism of the above-mentioned mutants. In addition, we could identify some mutants under investigation that demonstrate not only unfolding of the helical structures but also extension and generation of β-sheet structures and/or simultaneously have a highly exposed hydrophobic surface, which is assumed to be linked with the production of aggregate/fibril structures of the prion protein. Among the identified mutants, Q212E needs special attention due to its maximum exposure of the hydrophobic core to solvent, and E200Q is found to be important due to its maximum extent of β-content. We are also able to identify different structural conformations of the proteins according to their degree of structural unfolding, and those conformations can be extracted and further studied in detail. Communicated by Ramaswamy H. Sarma.


Subject(s)
Prion Diseases , Prions , Humans , Prion Proteins/genetics , Prion Proteins/chemistry , Prions/genetics , Thermodynamics
10.
Proteins ; 90(3): 658-669, 2022 03.
Article in English | MEDLINE | ID: mdl-34651333

ABSTRACT

Given a target protein structure, the prime objective of protein design is to find amino acid sequences that will fold/acquire to the given three-dimensional structure. The protein design problem belongs to the non-deterministic polynomial-time-hard class as sequence search space increases exponentially with protein length. To ensure better search space exploration and faster convergence, we propose a protein modularity-based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain defined as protein unit (PU). Here, we have incorporated a divide-and-conquer approach where a protein is split into PUs and each PU region is explored in a parallel fashion. It has been further analyzed that our shared memory implementation of modularity-based parallel sequence search leads to better search space exploration compared to the case of traditional full protein design. Sequence-based analysis on design sequences depicts an average of 39.7% sequence similarity on the benchmark data set. Structure-based comparison of the modeled structures of the design protein with the target structure exhibited an average root-mean-square deviation of 1.17 Å and an average template modeling score of 0.89. The selected modeled structures of the design protein sequences are validated using 100 ns molecular dynamics simulations where 80% of the proteins have shown better or similar stability to the respective target proteins. Our study informs that our modularity-based protein design algorithm can be extended to protein interaction design as well.
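
The structure-based comparison above is reported as a root-mean-square deviation. The sketch below shows the basic RMSD calculation for two equally sized sets of 3D coordinates. It assumes the two structures are already optimally superposed and matched atom by atom; a real comparison would first align them, a step this sketch omits. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two pre-superposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(target, model))
```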


Subject(s)
Proteins/chemistry , Algorithms , Amino Acid Sequence , Benchmarking , Computational Biology , Databases, Protein , Molecular Dynamics Simulation , Protein Conformation , Structure-Activity Relationship
11.
J Mol Biol ; 433(19): 167149, 2021 09 17.
Article in English | MEDLINE | ID: mdl-34271012

ABSTRACT

Infectious diseases in humans appear to be one of the most primary public health issues. Identification of novel disease-associated proteins will furnish an efficient recognition of the novel therapeutic targets. Here, we develop a Graph Convolutional Network (GCN)-based model called PINDeL to identify the disease-associated host proteins by integrating the human Protein Locality Graph and its corresponding topological features. Because of the amalgamation of GCN with the protein interaction network, PINDeL achieves the highest accuracy of 83.45% while AUROC and AUPRC values are 0.90 and 0.88, respectively. With high accuracy, recall, F1-score, specificity, AUROC, and AUPRC, PINDeL outperforms other existing machine-learning and deep-learning techniques for disease gene/protein identification in humans. Application of PINDeL on an independent dataset of 24320 proteins, which are not used for training, validation, or testing purposes, predicts 6448 new disease-protein associations of which we verify 3196 disease-proteins through experimental evidence like disease ontology, Gene Ontology, and KEGG pathway enrichment analyses. Our investigation informs that experimentally-verified 748 proteins are indeed responsible for pathogen-host protein interactions of which 22 disease-proteins share their association with multiple diseases such as cancer, aging, chem-dependency, pharmacogenomics, normal variation, infection, and immune-related diseases. This unique Graph Convolution Network-based prediction model is of utmost use in large-scale disease-protein association prediction and hence, will provide crucial insights on disease pathogenesis and will further aid in developing novel therapeutics.
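
For readers unfamiliar with graph convolution, the sketch below implements one layer of the widely used propagation rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W) on a tiny graph in plain Python. The abstract does not describe PINDeL's exact architecture, so this is a generic illustration; the function names, graph, features, and weights are all hypothetical.

```python
import math


def matmul(a, b):
    # Plain-Python matrix product for small nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weights):
    """One graph-convolution layer with self-loops and symmetric normalization."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization: divide each entry by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weights)
    return [[max(0.0, v) for v in row] for row in out]  # ReLU


if __name__ == "__main__":
    adjacency = [[0.0, 1.0], [1.0, 0.0]]  # two interacting proteins
    features = [[1.0], [3.0]]             # one feature per node
    weights = [[1.0]]
    print(gcn_layer(adjacency, features, weights))
```

Each node's new representation is a degree-normalized average over itself and its neighbors, which is how topological context from the interaction network enters the prediction.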


Subject(s)
Biomarkers/metabolism , Communicable Diseases/metabolism , Protein Interaction Mapping/methods , Deep Learning , Genetic Association Studies , Humans , Neural Networks, Computer , Protein Interaction Maps
12.
Proteins ; 89(10): 1353-1364, 2021 10.
Article in English | MEDLINE | ID: mdl-34076296

ABSTRACT

Protein interactions and their assemblies assist in understanding the cellular mechanisms through the knowledge of interactome. Despite recent advances, a vast number of interacting protein complexes is not annotated by three-dimensional structures. Therefore, a computational framework is a suitable alternative to fill the large gap between identified interactions and the interactions with known structures. In this work, we develop an automated computational framework for modeling functionally related protein-complex structures utilizing GO-based semantic similarity technique and co-evolutionary information of the interaction sites. The framework can consider protein sequence and structure information as input and employ both rigid-body docking and template-based modeling exploiting the existing structural templates and sequence homology information from the PDB. Our framework combines geometric as well as physicochemical features for re-ranking the docking decoys. The proposed framework has an 83% success rate when tested on a benchmark dataset while considering Top1 models for template-based modeling and Top10 models for the docking pipeline. We believe that our computational framework can be used for any pair of proteins with higher confidence to identify the functional protein-protein interactions.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Binding Sites , Databases, Protein , Protein Binding , Protein Interaction Mapping , Software , Structural Homology, Protein
13.
J Chem Inf Model ; 61(3): 1481-1492, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33683902

ABSTRACT

One of the grand challenges of this century is modeling and simulating a whole cell. Extreme regulation of an extensive quantity of model and simulation data during whole-cell modeling and simulation renders it a computationally expensive research problem in systems biology. In this article, we present a high-performance whole-cell simulation exploiting modular cell biology principles. We prepare the simulation by dividing the unicellular bacterium, Escherichia coli (E. coli), into subcells utilizing the spatially localized densely connected protein clusters/modules. We set up a Brownian dynamics-based parallel whole-cell simulation framework by utilizing the Hamiltonian mechanics-based equations of motion. Though the velocity Verlet integration algorithm possesses the capability of solving the equations of motion, it lacks the ability to capture and deal with particle-collision scenarios. Hence, we propose an algorithm for detecting and resolving both elastic and inelastic collisions and subsequently modify the velocity Verlet integrator by incorporating our algorithm into it. Also, we address the boundary conditions to arrest the molecules' motion outside the subcell. For efficiency, we define one hashing-based data structure called the cellular dictionary to store all of the subcell-related information. A benchmark analysis of our CUDA C/C++ simulation code when tested on E. coli using the CPU-GPU cluster indicates that the computational time requirement decreases with the increase in the number of computing cores and becomes stable at around 128 cores. Additional testing on higher organisms such as rats and humans informs us that our proposed work can be extended to any organism and is scalable for high-end CPU-GPU clusters.
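
The abstract describes extending the velocity Verlet integrator with collision handling. The one-dimensional sketch below shows a standard velocity Verlet step and the textbook velocity update for a perfectly elastic two-body collision. It is a simplified illustration, not the authors' CUDA implementation: it ignores Brownian forces, inelastic collisions, and boundary conditions, and all names are hypothetical.

```python
def verlet_step(x, v, a, dt, accel):
    """Advance position and velocity by one velocity Verlet step.

    accel is a function mapping position to acceleration (assumption:
    forces depend on position only).
    """
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = accel(x_new)
    v_new = v + 0.5 * (a + a_new) * dt
    return x_new, v_new, a_new


def elastic_collision(m1, m2, v1, v2):
    """Post-collision velocities for a 1D elastic collision.

    Conserves both momentum and kinetic energy.
    """
    total = m1 + m2
    u1 = ((m1 - m2) * v1 + 2.0 * m2 * v2) / total
    u2 = ((m2 - m1) * v2 + 2.0 * m1 * v1) / total
    return u1, u2


if __name__ == "__main__":
    print(verlet_step(0.0, 1.0, 0.0, 0.1, lambda pos: 0.0))
    print(elastic_collision(1.0, 1.0, 2.0, -1.0))
```

In a simulation loop, a collision check would run after each integration step and replace the velocities of any colliding pair before the next step.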


Subject(s)
Computer Graphics , Escherichia coli , Algorithms , Animals , Computer Simulation , Proteins , Rats
14.
Article in English | MEDLINE | ID: mdl-31329126

ABSTRACT

Protein design, also known as the inverse protein folding problem, is the identification of a protein sequence that folds into a target protein structure. Protein design has been proven to be an NP-hard problem. While researchers are working on designing heuristics with an emphasis on new scoring functions, we propose a replica-exchange Monte Carlo (REMC) search algorithm that ensures faster convergence using a greedy strategy. Using biological insights, we construct an evolutionary profile to encode the amino acid variability in different positions of the target protein from its structural homologs. The evolutionary profile guides the REMC search, and the greedy approach ensures appreciable exploration and exploitation of the sequence-structure fitness surface. We allow termination of a simulation trajectory once a stagnant situation is detected. A series of sequence- and structure-level validations establish the quality of our design. On a benchmark dataset, our algorithm reports an average root-mean-square deviation of 1.21 Å between the target and the design proteins when modeled with existing protein folding software. In addition, our algorithm achieves a 6.16-fold overall speedup. In molecular dynamics simulations, we observe that four out of five selected design proteins show better or comparable stability relative to the corresponding target proteins.
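
In replica-exchange Monte Carlo, replicas run at different temperatures and periodically attempt to swap configurations. The sketch below shows the standard swap acceptance probability, min(1, exp((1/T_i - 1/T_j)(E_i - E_j))) with the Boltzmann constant set to 1. The abstract does not give the authors' temperatures or energy function, so the names and numbers here are illustrative assumptions.

```python
import math


def swap_probability(e_i, e_j, t_i, t_j):
    """Probability of exchanging the states of two replicas.

    e_i, e_j: current energies of replicas i and j
    t_i, t_j: their temperatures (units with Boltzmann constant = 1)
    """
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


if __name__ == "__main__":
    # Cold replica (T=1) holds the higher energy: the swap is always accepted.
    print(swap_probability(5.0, 3.0, 1.0, 2.0))
    # Cold replica already holds the lower energy: the swap is accepted only sometimes.
    print(swap_probability(3.0, 5.0, 1.0, 2.0))
```

Swaps let low-energy sequences found at high temperature migrate to the cold replicas, which helps the search escape local minima.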


Subject(s)
Algorithms , Computational Biology/methods , Molecular Dynamics Simulation , Protein Folding , Proteins , Monte Carlo Method , Protein Conformation , Proteins/chemistry , Proteins/genetics , Proteins/metabolism
15.
J Chem Inf Model ; 60(12): 6679-6690, 2020 12 28.
Article in English | MEDLINE | ID: mdl-33225697

ABSTRACT

Insertions/deletions of amino acids in the protein backbone potentially result in altered structural/functional specifications. They can either contribute positively to the evolutionary process or can result in disease conditions. Despite being the second most prevalent form of protein modification, there are no databases or computational frameworks that delineate harmful multipoint deletions (MPD) from beneficial ones. We introduce a positive unlabeled learning-based prediction framework (PROFOUND) that utilizes fold-level attributes, environment-specific properties, and deletion site-specific properties to predict the change in foldability arising from such MPDs, both in the non-loop and loop regions of protein structures. In the absence of any protein structure dataset to study MPDs, we introduce a dataset with 153 MPD instances that lead to native-like folded structures and 7650 unlabeled MPD instances whose effect on the foldability of the corresponding proteins is unknown. PROFOUND on 10-fold cross-validation on our newly introduced dataset reports a recall of 82.2% (86.6%) and a fall out rate (FR) of 14.2% (20.6%), corresponding to MPDs in the protein loop (non-loop) region. The low FR suggests that the foldability in proteins subject to MPDs is not random and necessitates unique specifications of the deleted region. In addition, we find that additional evolutionary attributes contribute to higher recall and lower FR. The first of a kind foldability prediction system owing to MPD instances and the newly introduced dataset will potentially aid in novel protein engineering endeavors.
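
The two metrics reported above, recall and fall-out rate, can be computed from confusion counts as sketched below. This assumes the usual definitions (recall = TP / (TP + FN), fall-out = FP / (FP + TN)); how the paper counts negatives in its positive-unlabeled setting is not stated in the abstract, and the example counts are hypothetical.

```python
def recall(tp, fn):
    """Fraction of true positives that the model recovers."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def fall_out(fp, tn):
    """Fraction of negatives (or unlabeled instances treated as such) flagged positive."""
    return fp / (fp + tn) if (fp + tn) else 0.0


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    print(recall(tp=126, fn=27))
    print(fall_out(fp=150, tn=850))
```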


Subject(s)
Amino Acids , Proteins , Protein Engineering , Protein Folding , Proteins/genetics
16.
J Proteome Res ; 19(11): 4533-4542, 2020 11 06.
Article in English | MEDLINE | ID: mdl-32871072

ABSTRACT

The Viral Protein 35 (VP35), a crucial protein of the Zaire Ebolavirus (EBOV), interacts with a plethora of human proteins to cripple the human immune system. Despite its importance, the entire structure of the tetrameric assembly of EBOV VP35 and the means by which it antagonizes the autophosphorylation of the kinase domain of human protein kinase R (PKRK) is still elusive. We consult existing structural information to model a tetrameric assembly of the VP35 protein where 93% of the protein is modeled using crystal structure templates. We analyze our modeled tetrameric structure to identify interchain bonding networks and use molecular dynamics simulations and normal-mode analysis to unravel the flexibility and deformability of the different regions of the VP35 protein. We establish that the C-terminal of VP35 (VP35C) directly interacts with PKRK to prevent it from autophosphorylation. Further, we identify three plausible VP35C-PKRK complexes with better affinity than the PKRK dimer formed during autophosphorylation and use protein design to establish a new stretch in VP35C that interacts with PKRK. The proposed tetrameric assembly will aid in better understanding of the VP35 protein, and the reported VP35C-PKRK complexes along with their interacting sites will help in the shortlisting of small molecule inhibitors.


Subject(s)
Ebolavirus , Hemorrhagic Fever, Ebola , Humans , Nucleocapsid Proteins , Viral Proteins
17.
Biochim Biophys Acta Rev Cancer ; 1874(1): 188389, 2020 08.
Article in English | MEDLINE | ID: mdl-32659251

ABSTRACT

ETV6 (translocation-Ets-leukemia virus) gene is a transcriptional repressor mainly involved in haematopoiesis and maintenance of vascular networks and has developed to be a major oncogene with the potential ability of forming fusion partners with many other genes with carcinogenic consequences. ETV6 fusions function primarily by constitutive activation of kinase activity of the fusion partners, modifications in the normal functions of ETV6 transcription factor, loss of function of ETV6 or the partner gene and activation of a proto-oncogene near the site of translocation. The role of ETV6 fusion gene in tumorigenesis has been well-documented and more variedly found in haematological malignancies. However, the role of the ETV6 oncogene in solid tumors has also risen to prominence due to an increasing number of cases being reported with this malignancy. Since, solid tumors can be well-targeted, the diagnosis of this genre of tumors based on ETV6 malignancy is of crucial importance for treatment. This review highlights the important ETV6 associated fusions in solid tumors along with critical insights as to existing and novel means of targeting it. A consolidation of novel therapies such as immune, gene, RNAi, stem cell therapy and protein degradation hitherto unused in the case of ETV6 solid tumor malignancies may open further therapeutic avenues.


Subject(s)
Neoplasms/genetics , Oncogene Proteins, Fusion/genetics , Proto-Oncogene Proteins c-ets/genetics , Repressor Proteins/genetics , Antineoplastic Agents/therapeutic use , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Chromosome Aberrations , Humans , Molecular Targeted Therapy , Mutation , Neoplasms/pathology , Neoplasms/therapy , Oncogene Proteins, Fusion/metabolism , Proto-Oncogene Mas , Proto-Oncogene Proteins c-ets/metabolism , Repressor Proteins/metabolism , ETS Translocation Variant 6 Protein
18.
J Chem Inf Model ; 60(6): 3315-3323, 2020 06 22.
Article in English | MEDLINE | ID: mdl-32401507

ABSTRACT

Nonsynonymous single-nucleotide polymorphisms often result in altered protein stability while playing crucial roles both in the evolution process and in the development of human diseases. Prediction of change in the thermodynamic stability due to such missense mutations will help in protein engineering endeavors and will contribute to a better understanding of different disease conditions. Here, we develop a machine-learning-based framework, viz., ProTSPoM, to estimate the change in protein thermodynamic stability arising out of single-point mutations (SPMs). ProTSPoM outperforms existing methods on the S2648 and S1925 databases and reports a Pearson correlation coefficient of 0.82 (0.88) and a root-mean-squared-error of 0.92 (1.06) kcal/mol between the predicted and experimental ΔΔG values on the long-established S350 (tumor suppressor p53 protein) data set. Further, we estimate the change in thermodynamic stability for all possible SPMs in the DNA binding domain of the p53 protein. We identify single-nucleotide polymorphisms in p53 which are plausibly detrimental to its structural integrity and interaction affinity with the DNA molecule. ProTSPoM with its reliable estimates and time-efficient prediction is well suited to be integrated with existing protein engineering techniques. The ProTSPoM web server is accessible at http://cosmos.iitkgp.ac.in/ProTSPoM/.
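
The two agreement measures quoted above, the Pearson correlation coefficient and the root-mean-squared error between predicted and experimental ΔΔG values, can be computed as in this small sketch. The function names and the example numbers are hypothetical and are not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-squared error, in the same units as the inputs."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


if __name__ == "__main__":
    predicted = [-1.2, 0.3, -2.5, 0.8]     # hypothetical ΔΔG values, kcal/mol
    experimental = [-1.0, 0.1, -2.0, 1.1]  # hypothetical ΔΔG values, kcal/mol
    print(pearson(predicted, experimental), rmse(predicted, experimental))
```

The two numbers are complementary: correlation measures whether the ranking of mutations is right, while RMSE measures how far the predicted magnitudes are from experiment.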


Subject(s)
Point Mutation , Tumor Suppressor Protein p53 , Humans , Mutation , Protein Stability , Thermodynamics , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
19.
Proteins ; 88(2): 284-291, 2020 02.
Article in English | MEDLINE | ID: mdl-31412138

ABSTRACT

Protein phosphorylation is one of the essential posttranslational modifications, playing a vital role in the regulation of many fundamental cellular processes. We propose a LightGBM-based computational approach that uses evolutionary, geometric, sequence environment, and amino acid-specific features to decipher phosphate binding sites from a protein sequence. When compared with other existing methods on 2429 protein sequences taken from the standard Phospho.ELM (P.ELM) benchmark data set featuring 11 organisms, our method reports a higher F1 score = 0.504 (harmonic mean of the precision and recall) and ROC AUC = 0.836 (area under the curve of the receiver operating characteristics). The computation time of our proposed approach is much less than that of the recently developed deep learning-based framework. Structural analysis on selected protein sequences indicates that our predictions form a superset of the phosphorylation sites listed in the P.ELM data set. The foundation of our scheme is manual feature engineering and a decision tree-based classification. Hence, it is intuitive, and one can interpret the final tree as a set of rules, resulting in a deeper understanding of the relationships between biophysical features and phosphorylation sites. Our innovative problem transformation method permits more control over precision and recall, as is demonstrated by the fact that if we incorporate the output probability of the existing deep learning framework as an additional feature, then our prediction improves (F1 score = 0.546; ROC AUC = 0.849). The implementation of our method can be accessed at http://cse.iitkgp.ac.in/~pralay/resources/PPSBoost/ and is mirrored at https://cosmos.iitkgp.ac.in/PPSBoost.
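The two metrics named in this abstract can be made concrete with a short sketch. The code below is not the PPSBoost implementation; `f1_score` and `roc_auc` are hypothetical helper names, and the labels and scores in the demo are invented. F1 is computed as the harmonic mean of precision and recall, and ROC AUC as the fraction of (positive, negative) pairs in which the positive site receives the higher score, with ties counted as one half.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical per-residue labels and classifier probabilities.
    labels = [1, 1, 0, 0, 0, 1]
    probs = [0.91, 0.42, 0.55, 0.10, 0.30, 0.77]
    threshold = 0.5  # assumed cut-off; moving it trades precision for recall
    calls = [1 if p >= threshold else 0 for p in probs]
    print("F1     :", round(f1_score(labels, calls), 3))
    print("ROC AUC:", round(roc_auc(labels, probs), 3))
```

F1 depends on the chosen threshold whereas ROC AUC does not, which is one reason both figures are usually reported together. The pairwise AUC is quadratic in the number of examples and is meant for illustration rather than for large benchmarks.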


Subject(s)
Computational Biology/methods , Machine Learning , Protein Processing, Post-Translational , Proteins/chemistry , Sequence Analysis, Protein/methods , Algorithms , Animals , Binding Sites , Databases, Protein , Humans , Models, Molecular , Phosphorylation , Protein Conformation , Proteins/metabolism , Reproducibility of Results , Serine/chemistry , Serine/metabolism , Species Specificity , Threonine/chemistry , Threonine/metabolism , Tyrosine/chemistry , Tyrosine/metabolism
20.
Biochim Biophys Acta Gen Subj ; 1863(7): 1196-1209, 2019 07.
Article in English | MEDLINE | ID: mdl-31028823

ABSTRACT

BACKGROUND: Epithelial to mesenchymal transition (EMT) and extracellular matrix (ECM) remodeling are the two elemental processes promoting glioblastoma (GBM). In the present work we propose a mechanistic model of GBM and, in the process, establish a hypothesis elucidating critical crosstalk between heat shock proteins (HSPs) and matrix metalloproteinases (MMPs) with synergistic upregulation of an EMT-like process and ECM remodeling. METHODS: The interaction and the precise binding site between the HSP and MMP proteins were assayed computationally, in vitro, and in GBM clinical samples. RESULTS: A positive crosstalk of HSP27 with MMP-2 and MMP-9 was established in both GBM patient tissues and cell lines. This association was found to be of prime significance for ECM remodeling and promotion of EMT-like characteristics. In silico predictions revealed 3 plausible sites at which HSP27 interacts with MMP-2 and MMP-9. Site-directed mutagenesis followed by an in vitro immunoprecipitation (IP) assay with 3 mutated recombinant HSP27 proteins confirmed an interface stretch containing residues 29-40 of HSP27 to be a common interaction site for both MMP-2 and MMP-9. This was further validated by in vitro IP of truncated (sans AA 29-40) recombinant HSP27 with MMP-2 and MMP-9. CONCLUSION: The association of HSP27 with MMP-2 and MMP-9 proteins, along with the identified interacting stretch, has the potential to contribute towards drug development to inhibit GBM infiltration and migration. GENERAL SIGNIFICANCE: Current findings provide a novel therapeutic target for GBM, opening a new horizon in the field of GBM management.


Subject(s)
Brain Neoplasms/therapy , Glioblastoma/therapy , HSP27 Heat-Shock Proteins/metabolism , Matrix Metalloproteinase 2/metabolism , Matrix Metalloproteinase 9/metabolism , Brain Neoplasms/metabolism , Brain Neoplasms/pathology , Cell Line, Tumor , Disease Progression , Glioblastoma/metabolism , Glioblastoma/pathology , Humans
...