Search | VHL Regional Portal

1.

Most Monogenic Disorders Are Caused by Mutations Altering Protein Folding Free Energy.

Pandey, Preeti; Alexov, Emil.

Int J Mol Sci ; 25(4)2024 Feb 06.

Article in English | MEDLINE | ID: mdl-38396641

ABSTRACT

Revealing the molecular effect that pathogenic missense mutations have on the corresponding protein is crucial for developing therapeutic solutions. This is especially important for monogenic diseases since, for most of them, there is no treatment available, while typically, the treatment should be provided in the early development stages. This requires fast targeted drug development at a low cost. Here, we report an updated database of monogenic disorders (MOGEDO), which includes 768 proteins and the corresponding 2559 pathogenic and 1763 benign mutations, along with the functional classification of the corresponding proteins. Using the database and various computational tools that predict folding free energy change (ΔΔG), we demonstrate that, on average, 70% of pathogenic cases result in decreased protein stability. Such a large fraction indicates that one should aim at in silico screening for small molecules stabilizing the structure of the mutant protein. We emphasize that knowledge of ΔΔG is essential because one wants to develop stabilizers that compensate for ΔΔG but do not make the protein over-stable, since an over-stable protein may be dysfunctional. We demonstrate that, by using ΔΔG and predicted solvent exposure of the mutation site, one can develop a predictive method that distinguishes pathogenic from benign mutations with a success rate even better than some of the leading pathogenicity predictors. Furthermore, hydrophobic-hydrophobic mutations have stronger correlations between folding free energy change and pathogenicity compared with others. Also, mutations in which Cys, Gly, Arg, Trp, or Tyr is replaced by any other amino acid are more likely to be pathogenic. To facilitate further detection of pathogenic mutations, each wild-type amino acid in the 768 proteins mentioned above was mutated to the other 19 residues (14,847,817 mutations), the ΔΔG was calculated with SAAFEC-SEQ, and 5,506,051 mutations were predicted to be pathogenic.

Subject(s)

Protein Folding , Proteins , Thermodynamics , Proteins/chemistry , Mutation , Protein Stability , Amino Acids/genetics
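
The exhaustive scan described in the abstract of entry 1 (each wild-type residue replaced by each of the other 19 amino acids) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `enumerate_substitutions` and the toy sequence are assumptions made here, and the ΔΔG prediction step (SAAFEC-SEQ in the paper) is not reproduced.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def enumerate_substitutions(sequence):
    """Yield every single-residue substitution as (wild_type, position, mutant).

    Positions are 1-based, following the usual mutation notation (e.g. A6V).
    """
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield wild_type, position, mutant


# Hypothetical four-residue sequence, for illustration only.
toy_sequence = "MACG"
mutations = [f"{wt}{pos}{mt}" for wt, pos, mt in enumerate_substitutions(toy_sequence)]
```

A sequence of N standard residues yields 19 × N candidate substitutions, each of which would then be passed to a ΔΔG predictor.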

2.

Most monogenic disorders are caused by mutations altering protein folding free energy.

Pandey, Preeti; Alexov, Emil.

Res Sq ; 2023 Oct 19.

Article in English | MEDLINE | ID: mdl-37886551

ABSTRACT

Revealing the molecular effect that pathogenic missense mutations have on the corresponding protein is crucial for developing therapeutic solutions. This is especially important for monogenic diseases since, for most of them, there is no treatment available, while typically, the treatment should be provided in the early development stages. This requires fast, targeted drug development at a low cost. Here, we report a database of monogenic disorders (MOGEDO), which includes 768 proteins and the corresponding 2559 pathogenic and 1763 benign mutations, along with the functional classification of the corresponding proteins. Using the database and various computational tools that predict folding free energy change (ΔΔG), we demonstrate that, on average, 70% of pathogenic cases result in decreased protein stability. Such a large fraction indicates that one should aim at in silico screening for small molecules stabilizing the structure of the mutant protein. We emphasize that knowledge of ΔΔG is essential because one wants to develop stabilizers that compensate for ΔΔG but do not make the protein over-stable, since an over-stable protein may be dysfunctional. We demonstrate that, using ΔΔG and predicted solvent exposure of the mutation site, one can develop a predictive method that distinguishes pathogenic from benign mutations with a success rate even better than some of the leading pathogenicity predictors. Furthermore, hydrophobic-hydrophobic mutations have stronger correlations between folding free energy change and pathogenicity compared with others. Also, mutations in which Cys, Gly, Arg, Trp, or Tyr is replaced by any other amino acid are more likely to be pathogenic. To facilitate further detection of pathogenic mutations, each wild-type amino acid in the 768 proteins mentioned above was mutated to the other 19 residues (14,847,817 mutations), the ΔΔG was calculated with SAAFEC-SEQ, and 5,506,051 mutations were predicted to be pathogenic.

3.

Predicting the Effect of Single Mutations on Protein Stability and Binding with Respect to Types of Mutations.

Pandey, Preeti; Panday, Shailesh Kumar; Rimal, Prawin; Ancona, Nicolas; Alexov, Emil.

Int J Mol Sci ; 24(15)2023 Jul 28.

Article in English | MEDLINE | ID: mdl-37569449

ABSTRACT

The development of methods and algorithms to predict the effect of mutations on protein stability, protein-protein interaction, and protein-DNA/RNA binding is driven by the needs of protein engineering and of understanding the molecular mechanism of disease-causing variants. The vast majority of the leading methods require a database of experimentally measured folding and binding free energy changes for training. These databases are collections of experimental data taken from scientific investigations typically aimed at probing the role of particular residues in the above-mentioned thermodynamic characteristics, i.e., the mutations are not introduced at random and do not necessarily represent mutations originating from single nucleotide variants (SNVs). Thus, the reported performance of the leading algorithms assessed on these databases or other limited cases may not be applicable for predicting the effect of SNVs seen in the human population. Indeed, we demonstrate that SNVs and non-SNVs are not equally represented in the corresponding databases, and the distribution of the free energy changes is not the same. It is shown that the Pearson correlation coefficients (PCCs) of folding and binding free energy changes obtained in cases involving SNVs are smaller than for non-SNVs, indicating that caution should be used in applying these methods to reveal the effect of human SNVs. Furthermore, it is demonstrated that some methods are sensitive to the chemical nature of the mutations, resulting in PCCs that differ by a factor of four across chemically different mutations. All methods are found to underestimate the energy changes by roughly a factor of 2.

Subject(s)

Algorithms , Polymorphism, Single Nucleotide , Humans , Mutation , Protein Stability
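
The abstract of entry 3 reports two separate measures: the Pearson correlation coefficient (PCC) and a systematic underestimation of the energy changes. The sketch below, using made-up ΔΔG values rather than data from the paper, shows why both are needed: predictions that are exactly half the experimental values still give a PCC of 1, while the regression slope exposes the factor of 2. The function names are assumptions made for this illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def regression_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


# Made-up experimental ddG values (kcal/mol) and predictions scaled by 0.5.
experimental = [-2.0, -1.0, 0.5, 1.5, 3.0]
predicted = [0.5 * value for value in experimental]

pcc = pearson(experimental, predicted)             # perfect linear correlation
slope = regression_slope(experimental, predicted)  # magnitude underestimated by half
```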

4.

PKAD-2: New entries and expansion of functionalities of the database of experimentally measured pKa's of proteins.

Ancona, Nicolas; Bastola, Ananta; Alexov, Emil.

J Comput Biophys Chem ; 22(5): 515-524, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37520074

ABSTRACT

Almost all biological reactions are pH dependent, and understanding the origin of pH dependence requires knowledge of the pKa's of ionizable groups. Here we report a new edition of PKAD, PKAD-2, which is a database of experimentally measured pKa's of proteins, both wild type and mutant. The new additions include 117 wild type and 54 mutant pKa values, resulting in a total of 1742 experimentally measured pKa's. PKAD-2 includes 8 new wild type and 12 new mutant proteins, resulting in a total of 220 proteins. This new edition incorporates a visual 3D image of the highlighted residue of interest within the corresponding protein or protein complex. Hydrogen bonds were identified, counted, and implemented as a search feature. Other new search features include the number of neighboring residues within 4 Å of the heaviest atom of the side chain of a given amino acid. Here, we present PKAD-2 with the intention to continuously incorporate novel features and current data, with the goal of serving as a benchmark for computational methods.
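
To illustrate why a pKa value determines pH dependence, the sketch below applies the Henderson-Hasselbalch relation to compute the protonated fraction of a single, non-interacting ionizable group. The function name and the example pKa of 6.5 are assumptions chosen for illustration, not values taken from PKAD-2.

```python
def protonated_fraction(pka, ph):
    """Fraction of an isolated titratable group that is protonated at a given pH.

    Follows from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical group with pKa 6.5 evaluated at two pH values.
at_pka = protonated_fraction(6.5, 6.5)      # half protonated when pH equals pKa
one_above = protonated_fraction(6.5, 7.5)   # one pH unit above the pKa
```

One pH unit above the pKa, only about 9% of the group remains protonated, which is why a shift of a single pKa unit can change the charge state of a residue at physiological pH.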

5.

On the linkage of thermodynamics and pathogenicity.

Pandey, Preeti; Ghimire, Sanjeev; Wu, Bohua; Alexov, Emil.

Curr Opin Struct Biol ; 80: 102572, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36965249

ABSTRACT

This review outlines the effect of disease-causing mutations on proteins' thermodynamics. Two major thermodynamic quantities that are essential for structural integrity, the folding and binding free energy changes caused by missense mutations, are considered. It is emphasized that disease effects in the case of complex diseases may originate from several mutations over several genes, while monogenic diseases are caused by mutation in a single gene. Nevertheless, in both cases it is shown that pathogenic mutations cause larger perturbations of the above-mentioned thermodynamic quantities than benign mutations do. Recent works demonstrating the effect of pathogenic mutations on the above-mentioned thermodynamic quantities, as well as on structural dynamics and allosteric pathways, are reviewed.

Subject(s)

Protein Folding , Proteins , Virulence , Mutation , Proteins/chemistry , Thermodynamics

6.

Electrostatics in Computational Biophysics and Its Implications for Disease Effects.

Sun, Shengjie; Poudel, Pitambar; Alexov, Emil; Li, Lin.

Int J Mol Sci ; 23(18)2022 Sep 07.

Article in English | MEDLINE | ID: mdl-36142260

ABSTRACT

This review outlines the role of electrostatics in computational molecular biophysics and its implication in altering wild-type characteristics of biological macromolecules, and thus the contribution of electrostatics to disease mechanisms. The work is not intended to review existing computational approaches or to propose further developments. Instead, it summarizes the outcomes of relevant studies and provides a generalized classification of major mechanisms that involve electrostatic effects in both wild-type and mutant biological macromolecules. It emphasizes the complex role of electrostatics in molecular biophysics, such that the long range of electrostatic interactions causes them to dominate all other forces at distances larger than several Angstroms, while at the same time, the alteration of short-range wild-type electrostatic pairwise interactions can have pronounced effects as well. Because of this dual nature of electrostatic interactions, being dominant at long-range and being very specific at short-range, their implications for wild-type structure and function are quite pronounced. Therefore, any disruption of the complex electrostatic network of interactions may abolish wild-type functionality and could be the dominant factor contributing to pathogenicity. However, we also outline that due to the plasticity of biological macromolecules, the effect of amino acid mutation may be reduced, and thus a charge deletion or insertion may not necessarily be deleterious.

Subject(s)

Amino Acids , Proteins , Biophysics , Proteins/chemistry , Static Electricity

7.

Protein-Protein Binding Free Energy Predictions with the MM/PBSA Approach Complemented with the Gaussian-Based Method for Entropy Estimation.

Panday, Shailesh Kumar; Alexov, Emil.

ACS Omega ; 7(13): 11057-11067, 2022 Apr 05.

Article in English | MEDLINE | ID: mdl-35415339

ABSTRACT

Here, we present a Gaussian-based method for estimation of protein-protein binding entropy to augment the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method for computational prediction of binding free energy (ΔG). The method is termed f5-MM/PBSA/E, where "E" stands for entropy and f5 for five adjustable parameters. The enthalpy components of ΔG (molecular mechanics, polar and non-polar solvation energies) are computed from a single implicit solvent generalized Born (GB) energy minimized structure of a protein-protein complex, while the binding entropy is computed using independently GB energy minimized unbound and bound structures. It should be emphasized that the f5-MM/PBSA/E method does not use snapshots, just energy minimized structures, and is thus very fast and computationally efficient. The method is trained and benchmarked in a 5-fold validation test over a data set consisting of 46 protein-protein binding cases with experimentally determined dissociation constant (Kd) values. This data set has been used for benchmarking in recently published protein-protein binding studies that apply conventional MM/PBSA and MM/PBSA with an enhanced sampling method. The f5-MM/PBSA/E tested on the same data set achieves similar or better performance than these computationally demanding approaches, making it an excellent choice for high throughput protein-protein binding affinity prediction studies.
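
Benchmarking a predicted ΔG against a measured dissociation constant requires converting Kd to a free energy. The sketch below uses the standard relation ΔG = RT ln(Kd / 1 M); it is not part of the f5-MM/PBSA/E method itself, and the function name, the default temperature of 298.15 K, and the example Kd are assumptions made for illustration.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L.

    Uses dG = R * T * ln(Kd / 1 M); tighter binding (smaller Kd) gives a
    more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# Hypothetical nanomolar binder: roughly -12.3 kcal/mol at 298.15 K.
dg_nanomolar = binding_free_energy(1e-9)
```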

8.

Opioid Addiction and Opioid Receptor Dimerization: Structural Modeling of the OPRD1 and OPRM1 Heterodimer and Its Signaling Pathways.

Wu, Bohua; Hand, William; Alexov, Emil.

Int J Mol Sci ; 22(19)2021 Sep 24.

Article in English | MEDLINE | ID: mdl-34638633

ABSTRACT

Opioid addiction is a complex phenomenon with genetic, social, and other components. Due to such complexity, it is difficult to interpret the outcome of clinical studies, and thus, mutations found in individuals with these addictions are still not indisputably classified as opioid addiction-causing variants. Here, we computationally investigated two such mutations, A6V and N40D, found in the mu opioid receptor gene OPRM1. The mutations are located in the extracellular domain of the corresponding protein, which is important to the hetero-dimerization of OPRM1 with the delta opioid receptor protein (OPRD1). The hetero-dimerization of OPRD1-OPRM1 affects the signaling pathways activated by opioids and natural peptides and, thus, could be considered a factor contributing to addiction. In this study, we built four 3D structures of molecular pathways, including the G-protein signaling pathway and the β-arrestin signaling pathway of the heterodimer of OPRD1-OPRM1. We also analyzed the effect of the A6V and N40D mutations on the stability of individual OPRM1/OPRD1 molecules and the OPRD1-OPRM1 heterodimer with the goal of inferring their plausible linkage with opioid addiction. It was found that both mutations slightly destabilize OPRM1/OPRD1 monomers and weaken their association. Since hetero-dimerization is a key step for signaling processes, it is anticipated that both mutations may be causing increased addiction risk.

Subject(s)

Opioid-Related Disorders/genetics , Receptors, Opioid, delta/genetics , Receptors, Opioid, mu/genetics , Receptors, Opioid/genetics , Signal Transduction/genetics , Dimerization , Humans , Mutation/genetics , beta-Arrestins/genetics

9.

SAMPDI-3D: predicting the effects of protein and DNA mutations on protein-DNA interactions.

Li, Gen; Panday, Shailesh Kumar; Peng, Yunhui; Alexov, Emil.

Bioinformatics ; 37(21): 3760-3765, 2021 Nov 05.

Article in English | MEDLINE | ID: mdl-34343273

ABSTRACT

MOTIVATION: Mutations that alter protein-DNA interactions may be pathogenic and cause diseases. Therefore, it is extremely important to quantify the effect of mutations on protein-DNA binding free energy to reveal the molecular origin of diseases and to assist the development of treatments. Although several methods that predict the change of protein-DNA binding affinity upon mutations in the binding protein have been developed, the effect of DNA mutations has not yet been considered. RESULTS: Here, we report a new version of SAMPDI, SAMPDI-3D, which is a gradient boosting decision tree machine learning method to predict the change of the protein-DNA binding free energy caused by mutations in both the binding protein and the bases of the corresponding DNA. The method is shown to achieve Pearson correlation coefficients of 0.76 and 0.80 in a benchmarking test against experimentally determined changes of the binding free energy caused by mutations in the binding protein or DNA, respectively. Furthermore, three datasets collected from the literature were used as a blind benchmark for SAMPDI-3D, which is shown to outperform all existing state-of-the-art methods. The method is very fast, allowing for genome-scale investigations. AVAILABILITY AND IMPLEMENTATION: It is available as a web server and as stand-alone code at http://compbio.clemson.edu/SAMPDI-3D/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Proteins , Software , Proteins/chemistry , Mutation , Protein Binding , DNA/metabolism

10.

Computational Investigation of the pH Dependence of Stability of Melanosome Proteins: Implication for Melanosome formation and Disease.

Koirala, Mahesh; Shashikala, H B Mihiri; Jeffries, Jacob; Wu, Bohua; Loftus, Stacie K; Zippin, Jonathan H; Alexov, Emil.

Int J Mol Sci ; 22(15)2021 Jul 31.

Article in English | MEDLINE | ID: mdl-34361043

ABSTRACT

Intravesicular pH plays a crucial role in melanosome maturation and function. Melanosomal pH changes during maturation from very acidic in the early stages to neutral in late stages. Neutral pH is critical for providing optimal conditions for the rate-limiting, pH-sensitive melanin-synthesizing enzyme tyrosinase (TYR). This dramatic change in pH is thought to result from the activity of several proteins that control melanosomal pH. Here, we computationally investigated the pH-dependent stability of several melanosomal membrane proteins and compared them to the pH dependence of the stability of TYR. We confirmed that the pH optimum of TYR is neutral, and we also found that proteins that are negative regulators of melanosomal pH are predicted to function optimally at neutral pH. In contrast, positive pH regulators were predicted to have an acidic pH optimum. We propose a competitive mechanism among positive and negative regulators that results in pH equilibrium. Our findings are consistent with previous work that demonstrated a correlation between the pH optima of stability and activity, and they are consistent with the expected activity of positive and negative regulators of melanosomal pH. Furthermore, our data suggest that disease-causing variants impact the pH dependence of melanosomal proteins; this is particularly prominent for the OCA2 protein. In conclusion, melanosomal pH appears to affect the activity of multiple melanosomal proteins.

Subject(s)

Antigens, Neoplasm/chemistry , Copper-Transporting ATPases/chemistry , Melanosomes/metabolism , Membrane Transport Proteins/chemistry , Molecular Dynamics Simulation , Monophenol Monooxygenase/chemistry , Protons , Antigens, Neoplasm/metabolism , Copper-Transporting ATPases/metabolism , Humans , Hydrogen-Ion Concentration , Melanosomes/chemistry , Membrane Transport Proteins/metabolism , Monophenol Monooxygenase/metabolism , Protein Stability

11.

Increased p53 signaling impairs neural differentiation in HUWE1-promoted intellectual disabilities.

Aprigliano, Rossana; Aksu, Merdane Ezgi; Bradamante, Stefano; Mihaljevic, Boris; Wang, Wei; Rian, Kristin; Montaldo, Nicola P; Grooms, Kayla Mae; Fordyce Martin, Sarah L; Bordin, Diana L; Bosshard, Matthias; Peng, Yunhui; Alexov, Emil; Skinner, Cindy; Liabakk, Nina-Beate; Sullivan, Gareth J; Bjørås, Magnar; Schwartz, Charles E; van Loon, Barbara.

Cell Rep Med ; 2(4): 100240, 2021 Apr 20.

Article in English | MEDLINE | ID: mdl-33948573

ABSTRACT

Essential E3 ubiquitin ligase HUWE1 (HECT, UBA, and WWE domain containing 1) regulates key factors, such as p53. Although mutations in HUWE1 cause heterogenous neurodevelopmental X-linked intellectual disabilities (XLIDs), the disease mechanisms common to these syndromes remain unknown. In this work, we identify p53 signaling as the central process altered in HUWE1-promoted XLID syndromes. By focusing on Juberg-Marsidi syndrome (JMS), one of the severest XLIDs, we show that increased p53 signaling results from p53 accumulation caused by HUWE1 p.G4310R destabilization. This further alters cell-cycle progression and proliferation in JMS cells. Modeling of JMS neurodevelopment reveals majorly impaired neural differentiation accompanied by increased p53 signaling. The neural differentiation defects can be successfully rescued by reducing p53 levels and restoring the expression of p53 target genes, in particular CDKN1A/p21. In summary, our findings suggest that increased p53 signaling underlies HUWE1-promoted syndromes and impairs XLID JMS neural differentiation.

Subject(s)

Cell Differentiation/genetics , Intellectual Disability/genetics , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Proteins/genetics , Ubiquitin-Protein Ligases/genetics , Cell Differentiation/physiology , Genes, X-Linked/genetics , Humans , Mutation/genetics

12.

On regularization of charge singularities in solving the Poisson-Boltzmann equation with a smooth solute-solvent boundary.

Wang, Siwen; Alexov, Emil; Zhao, Shan.

Math Biosci Eng ; 18(2): 1370-1405, 2021 Jan 21.

Article in English | MEDLINE | ID: mdl-33757190

ABSTRACT

Numerical treatment of singular charges is a grand challenge in solving the Poisson-Boltzmann (PB) equation for analyzing electrostatic interactions between the solute biomolecules and the surrounding solvent with ions. For diffuse interface PB models in which solute and solvent are separated by a smooth boundary, no effective algorithm for singular charges has been developed, because the fundamental solution with a space dependent dielectric function is intractable. In this work, a novel regularization formulation is proposed to capture the singularity analytically, which is the first of its kind for diffuse interface PB models. The success lies in a dual decomposition - besides decomposing the potential into Coulomb and reaction field components, the dielectric function is also split into a constant base plus space changing part. Using the constant dielectric base, the Coulomb potential is represented analytically via Green's functions. After removing the singularity, the reaction field potential satisfies a regularized PB equation with a smooth source. To validate the proposed regularization, a Gaussian convolution surface (GCS) is also introduced, which efficiently generates a diffuse interface for three-dimensional realistic biomolecules. The performance of the proposed regularization is examined by considering both analytical and GCS diffuse interfaces, and compared with the trilinear method. Moreover, the proposed GCS-regularization algorithm is validated by calculating electrostatic free energies for a set of proteins and by estimating salt affinities for seven protein complexes. The results are consistent with experimental data and estimates of sharp interface PB models.

Subject(s)

Algorithms , Proteins , Entropy , Solvents , Static Electricity

13.

SAAFEC-SEQ: A Sequence-Based Method for Predicting the Effect of Single Point Mutations on Protein Thermodynamic Stability.

Li, Gen; Panday, Shailesh Kumar; Alexov, Emil.

Int J Mol Sci ; 22(2)2021 Jan 09.

Article in English | MEDLINE | ID: mdl-33435356

ABSTRACT

Modeling the effect of mutations on protein thermodynamic stability is useful for protein engineering and understanding molecular mechanisms of disease-causing variants. Here, we report a new development of the SAAFEC method, SAAFEC-SEQ, which is a gradient boosting decision tree machine learning method to predict the change of the folding free energy caused by amino acid substitutions. The method does not require the 3D structure of the corresponding protein, but only its sequence and, thus, can be applied to genome-scale investigations where structural information is very sparse. SAAFEC-SEQ uses physicochemical properties, sequence features, and evolutionary information features to make the predictions. It is shown to consistently outperform all existing state-of-the-art sequence-based methods in both the Pearson correlation coefficient and root-mean-squared-error parameters as benchmarked on several independent datasets. SAAFEC-SEQ has been implemented in a web server and is available as stand-alone code that can be downloaded and embedded into other researchers' code.

Subject(s)

Protein Stability , Proteins/chemistry , Amino Acid Substitution , Humans , Machine Learning , Point Mutation , Proteins/genetics , Software , Thermodynamics

14.

SAAMBE-SEQ: a sequence-based method for predicting mutation effect on protein-protein binding affinity.

Li, Gen; Pahari, Swagata; Murthy, Adithya Krishna; Liang, Siqi; Fragoza, Robert; Yu, Haiyuan; Alexov, Emil.

Bioinformatics ; 37(7): 992-999, 2021 May 17.

Article in English | MEDLINE | ID: mdl-32866236

ABSTRACT

MOTIVATION: The vast majority of human genetic disorders are associated with mutations that affect protein-protein interactions by altering wild-type binding affinity. Therefore, it is extremely important to assess the effect of mutations on protein-protein binding free energy to assist the development of therapeutic solutions. Currently, the most popular approaches use structural information to deliver the predictions, which precludes their application to genome-scale investigations. Indeed, with the progress of genomic sequencing, researchers are frequently assessing the effect of mutations for which there is no structure available. RESULTS: Here, we report a Gradient Boosting Decision Tree machine learning algorithm, SAAMBE-SEQ, which is completely sequence-based and does not require structural information at all. SAAMBE-SEQ utilizes 80 features representing evolutionary information, sequence-based features and change of physical properties upon mutation at the mutation site. The approach is shown to achieve a Pearson correlation coefficient (PCC) of 0.83 in 5-fold cross validation in a benchmarking test against experimentally determined binding free energy change (ΔΔG). Further, a blind test set (no-STRUC) is compiled by collecting experimental ΔΔG upon mutation for protein complexes for which a structure is not available, and is used to benchmark SAAMBE-SEQ, resulting in a PCC in the range of 0.37-0.46. The accuracy of the SAAMBE-SEQ method is found to be either better than or comparable to the most advanced structure-based methods. SAAMBE-SEQ is very fast, available as a webserver and as stand-alone code, and utilizes only sequence information, and thus it is applicable to genome-scale investigations of the effect of mutations on protein-protein interactions. AVAILABILITY AND IMPLEMENTATION: SAAMBE-SEQ is available at http://compbio.clemson.edu/saambe_webserver/indexSEQ.php#started. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Proteins , Software , Algorithms , Humans , Mutation , Protein Binding , Proteins/genetics

15.

A Newton-like iterative method implemented in the DelPhi for solving the nonlinear Poisson-Boltzmann equation.

Li, Chuan; McGowan, Mark; Alexov, Emil; Zhao, Shan.

Math Biosci Eng ; 17(6): 6259-6277, 2020 Sep 21.

Article in English | MEDLINE | ID: mdl-33378855

ABSTRACT

DelPhi is a popular scientific program which numerically solves the Poisson-Boltzmann equation (PBE) for electrostatic potentials and energies of biomolecules immersed in water via finite difference method. It is well known for its accuracy, reliability, flexibility, and efficiency. In this work, a new edition of DelPhi that uses a novel Newton-like method to solve the nonlinear PBE, in addition to the already implemented Successive Over Relaxation (SOR) algorithm, is introduced. Our tests on various examples have shown that this new method is superior to the SOR method in terms of stability when solving the nonlinear PBE, being able to converge even for problems involving very strong nonlinearity.

Subject(s)

Algorithms , Reproducibility of Results , Static Electricity
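
As a rough illustration of the idea behind entry 15, the sketch below applies a plain Newton iteration to a one-dimensional, dimensionless toy version of the nonlinear Poisson-Boltzmann equation, u'' = κ² sinh(u), discretized by finite differences. It is not the Newton-like method implemented in DelPhi, whose details the abstract does not give; the grid size, boundary values, tolerance, and all function names are assumptions made here.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0], upper[-1] ignored)."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def residual(u, left, right, h, kappa2):
    """Finite-difference residual of u'' - kappa2 * sinh(u) at interior points."""
    n = len(u)
    values = []
    for i in range(n):
        u_prev = u[i - 1] if i > 0 else left
        u_next = u[i + 1] if i < n - 1 else right
        values.append((u_prev - 2.0 * u[i] + u_next) / h ** 2 - kappa2 * math.sinh(u[i]))
    return values


def newton_pb_1d(left=4.0, right=0.0, length=10.0, n_interior=199,
                 kappa2=1.0, tol=1e-8, max_iter=50):
    """Solve the toy problem with fixed boundary potentials; return (u, iterations)."""
    h = length / (n_interior + 1)
    off_diag = 1.0 / h ** 2
    # Initial guess: straight line between the two boundary values.
    u = [left + (right - left) * (i + 1) / (n_interior + 1) for i in range(n_interior)]
    for iteration in range(max_iter):
        f = residual(u, left, right, h, kappa2)
        if max(abs(value) for value in f) < tol:
            return u, iteration
        # Jacobian is tridiagonal: the sinh term only contributes to the diagonal.
        diag = [-2.0 * off_diag - kappa2 * math.cosh(value) for value in u]
        delta = solve_tridiagonal([off_diag] * n_interior, diag,
                                  [off_diag] * n_interior, [-value for value in f])
        u = [a + b for a, b in zip(u, delta)]
    raise RuntimeError("Newton iteration did not converge")


potential, iterations = newton_pb_1d()
```

The nonlinearity enters the linear solve only through the cosh term on the diagonal, so each Newton step costs one tridiagonal solve in this toy setting; a three-dimensional solver would need a different linear-algebra strategy.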

16.

An Ensemble Approach to Predict the Pathogenicity of Synonymous Variants.

Ranganathan Ganakammal, Satishkumar; Alexov, Emil.

Genes (Basel) ; 11(9)2020 Sep 21.

Article in English | MEDLINE | ID: mdl-32967157

ABSTRACT

Single-nucleotide variants (SNVs) are a major form of genetic variation in the human genome that contribute to various disorders. There are two types of SNVs, namely non-synonymous (missense) variants (nsSNVs) and synonymous variants (sSNVs), predominantly involved in RNA processing or gene regulation. sSNVs, unlike missense or nsSNVs, do not alter the amino acid sequences, which makes them challenging candidates for downstream functional studies. Numerous computational methods have been developed to evaluate the clinical impact of nsSNVs, but very few methods are available for understanding the effects of sSNVs. For this analysis, we have downloaded sSNVs from the ClinVar database with various features such as conservation, DNA-RNA, and splicing properties. We performed feature selection and implemented an ensemble random forest (RF) classification algorithm to build a classifier to predict the pathogenicity of the sSNVs. We demonstrate that the ensemble predictor with selected features (20 features) enhances the classification of sSNVs into two categories, pathogenic and benign, with high accuracy (87%), precision (79%), and recall (91%). Furthermore, we used this prediction model to reclassify sSNVs with unknown clinical significance. Finally, the method is very robust and can be used to predict the effect of other unknown sSNVs.

Subject(s)

Polymorphism, Single Nucleotide/genetics , RNA Splicing/genetics , Virulence/genetics , Algorithms , Genome, Human/genetics , Humans
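
The three figures quoted in the abstract of entry 16 (accuracy, precision, recall) are all derived from the confusion matrix of a binary classifier. The sketch below shows the definitions, taking "pathogenic" as the positive class; the counts are made up for illustration and are not the paper's data, and the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all calls that are correct
        "precision": tp / (tp + fp),     # fraction of predicted positives that are real
        "recall": tp / (tp + fn),        # fraction of real positives that are found
    }


# Made-up counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

A recall higher than the precision, as in the abstract, means the classifier misses few pathogenic variants at the cost of flagging some benign ones.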

17.

Mutations in FAM50A suggest that Armfield XLID syndrome is a spliceosomopathy.

Lee, Yu-Ri; Khan, Kamal; Armfield-Uhas, Kim; Srikanth, Sujata; Thompson, Nicola A; Pardo, Mercedes; Yu, Lu; Norris, Joy W; Peng, Yunhui; Gripp, Karen W; Aleck, Kirk A; Li, Chumei; Spence, Ed; Choi, Tae-Ik; Kwon, Soo Jeong; Park, Hee-Moon; Yu, Daseuli; Heo, Won Do; Mooney, Marie R; Baig, Shahid M; Wentzensen, Ingrid M; Telegrafi, Aida; McWalter, Kirsty; Moreland, Trevor; Roadhouse, Chelsea; Ramsey, Keri; Lyons, Michael J; Skinner, Cindy; Alexov, Emil; Katsanis, Nicholas; Stevenson, Roger E; Choudhary, Jyoti S; Adams, David J; Kim, Cheol-Hee; Davis, Erica E; Schwartz, Charles E.

Nat Commun ; 11(1): 3698, 2020 Jul 23.

Article in English | MEDLINE | ID: mdl-32703943

ABSTRACT

Intellectual disability (ID) is a heterogeneous clinical entity and includes an excess of males who harbor variants on the X-chromosome (XLID). We report rare FAM50A missense variants in the original Armfield XLID syndrome family localized in Xq28 and four additional unrelated males with overlapping features. Our fam50a knockout (KO) zebrafish model exhibits abnormal neurogenesis and craniofacial patterning, and in vivo complementation assays indicate that the patient-derived variants are hypomorphic. RNA sequencing analysis of fam50a KO zebrafish shows dysregulation of the transcriptome, with augmented spliceosome mRNAs and depletion of transcripts involved in neurodevelopment. Zebrafish RNA-seq datasets show a preponderance of 3' alternative splicing events in fam50a KO, suggesting a role in the spliceosome C complex. These data are supported by transcriptomic signatures from cell lines derived from affected individuals and FAM50A protein-protein interaction data. In sum, Armfield XLID syndrome is a spliceosomopathy associated with aberrant mRNA processing during development.

Subject(s)

DNA-Binding Proteins/genetics , Intellectual Disability/genetics , Mental Retardation, X-Linked/genetics , Mutation/genetics , RNA-Binding Proteins/genetics , Spliceosomes/metabolism , Zebrafish Proteins/genetics , Adult , Animals , Cell Nucleus/metabolism , Child , Child, Preschool , DNA-Binding Proteins/metabolism , Family , Female , Gene Expression Regulation, Developmental , Humans , Male , Mice , Mutation, Missense/genetics , NIH 3T3 Cells , Pedigree , Phenotype , Protein Transport , RNA Splicing/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Small Nuclear/genetics , RNA-Binding Proteins/metabolism , Syndrome , Zebrafish/genetics , Zebrafish Proteins/metabolism

18.

SAAMBE-3D: Predicting Effect of Mutations on Protein-Protein Interactions.

Pahari, Swagata; Li, Gen; Murthy, Adithya Krishna; Liang, Siqi; Fragoza, Robert; Yu, Haiyuan; Alexov, Emil.

Int J Mol Sci ; 21(7)2020 Apr 07.

Article in English | MEDLINE | ID: mdl-32272725

ABSTRACT

Maintaining wild-type protein-protein interactions is essential for the normal function of the cell, and any mutation that alters their characteristics can cause disease. Therefore, the ability to correctly and quickly predict the effect of amino acid mutations is crucial for understanding disease effects and for carrying out genome-wide studies. Here, we report a new development of the SAAMBE method, SAAMBE-3D, a machine learning-based approach that is both accurate and extremely fast. It achieves a Pearson correlation coefficient ranging from 0.78 to 0.82, depending on the training protocol, in a five-fold validation benchmark against the SKEMPI v2.0 database, and it outperforms currently existing algorithms on various blind tests. Furthermore, optimized and tested via five-fold cross-validation on the Cornell University dataset, SAAMBE-3D achieves AUCs of 1.0 and 0.96 on homo- and hetero-dimer test datasets. Another important feature of SAAMBE-3D is its speed: a prediction takes a fraction of a second. SAAMBE-3D is available both as a web server and as stand-alone code, so other researchers can download the code and run it on their local computers. Taken together, SAAMBE-3D is accurate and fast software applicable to genome-wide studies assessing the effect of amino acid mutations on protein-protein interactions. The web server and the stand-alone codes (SAAMBE-3D for predicting the change of binding free energy and SAAMBE-3D-DN for predicting whether the mutation is disruptive or non-disruptive) are available.

Subject(s)

Mutation/genetics , Protein Interaction Maps/genetics , Proteins/genetics , Algorithms , Amino Acids/genetics , Genome-Wide Association Study/methods , Humans , Machine Learning , Protein Binding/genetics , Software
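The five-fold validation and Pearson correlation reported in the SAAMBE-3D abstract can be illustrated with a minimal sketch. This is not the SAAMBE-3D code: the one-feature linear model, the synthetic data, and every function name below are assumptions chosen only to show how a cross-validated Pearson coefficient is typically computed.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def make_folds(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(x, y):
    """Least-squares line; a hypothetical stand-in for the real predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def cross_validated_pearson(features, targets, k=5, seed=0):
    """Predict each fold with a model trained on the other k-1 folds,
    then correlate the pooled out-of-fold predictions with the targets."""
    n = len(features)
    predicted = [0.0] * n
    for held_out in make_folds(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        slope, intercept = fit_line(
            [features[i] for i in train], [targets[i] for i in train]
        )
        for i in held_out:
            predicted[i] = slope * features[i] + intercept
    return pearson(predicted, targets)


# Synthetic, illustrative data only (not SKEMPI values).
rng = random.Random(1)
xs = [rng.uniform(-3.0, 3.0) for _ in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
r = cross_validated_pearson(xs, ys)
```

Pooling the out-of-fold predictions before computing the coefficient is one common convention; averaging a per-fold coefficient is another, and the abstract does not say which was used.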

19.

Capturing the Effects of Explicit Waters in Implicit Electrostatics Modeling: Qualitative Justification of Gaussian-Based Dielectric Models in DelPhi.

Chakravorty, Arghya; Panday, Shailesh; Pahari, Swagata; Zhao, Shan; Alexov, Emil.

J Chem Inf Model ; 60(4): 2229-2246, 2020 04 27.

Article in English | MEDLINE | ID: mdl-32155062

ABSTRACT

Our group has implemented a smooth Gaussian-based dielectric function in DelPhi (J. Chem. Theory Comput. 2013, 9 (4), 2126-2136) which models the solute as an object with inhomogeneous dielectric permittivity and provides a smooth transition of dielectric permittivity from surface-bound water to bulk solvent. Although it is well-understood that the protein hydrophobic core is less polarizable than the hydrophilic protein surface, less attention is paid to the polarizability of water molecules inside the solute and on its surface. Here, we apply explicit water simulations to study the behavior of water molecules buried inside a protein and on the surface of that protein and contrast it with the behavior of the bulk water. We selected a protein that is experimentally shown to have five cavities, most of which are occupied by water molecules. We demonstrate through molecular dynamics (MD) simulations that the behavior of water in the cavity is drastically different from that in the bulk. These observations were made by comparing the mean residence times, dipole orientation relaxation times, and average dipole moment fluctuations. We also show that the bulk region has a nonuniform distribution of these tempo-spatial properties. From the perspective of continuum electrostatics, we argue that the dielectric "constant" in water-filled cavities of proteins and the space close to the molecular surface should differ from that assigned to the bulk water. This provides support for the Gaussian-based smooth dielectric model for solving electrostatics in the Poisson-Boltzmann equation framework. Furthermore, we demonstrate that using a well-parametrized Gaussian-based model with a single energy-minimized configuration of a protein can also reproduce its ensemble-averaged polar solvation energy. Thus, we argue that the Gaussian-based smooth dielectric model not only captures accurate physics but also provides an efficient way of computing ensemble-averaged quantities.

Subject(s)

Proteins , Static Electricity , Normal Distribution , Solutions , Solvents
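The smooth transition of dielectric permittivity described in the DelPhi abstract can be sketched as follows. The abstract does not give the functional form, so this sketch assumes one commonly described for Gaussian-based models: each atom contributes a Gaussian density, the densities are combined into a total occupancy, and the permittivity is interpolated between an internal and an external value. The parameter values (`sigma`, `eps_in`, `eps_out`) and all function names are illustrative assumptions, not DelPhi's API.

```python
import math


def atom_density(point, center, radius, sigma):
    """Gaussian density of one atom at `point`; equals 1 at the atom center.
    `sigma` scales the atomic radius (assumed form, not taken from the abstract)."""
    dist_sq = sum((p - c) ** 2 for p, c in zip(point, center))
    return math.exp(-dist_sq / (sigma ** 2 * radius ** 2))


def dielectric(point, atoms, eps_in=4.0, eps_out=80.0, sigma=0.93):
    """Smooth permittivity at `point` for a list of (center, radius) atoms.

    Total density: rho = 1 - prod_i (1 - rho_i)
    Permittivity:  eps = rho * eps_in + (1 - rho) * eps_out
    """
    unoccupied = 1.0
    for center, radius in atoms:
        unoccupied *= 1.0 - atom_density(point, center, radius, sigma)
    rho = 1.0 - unoccupied
    return rho * eps_in + (1.0 - rho) * eps_out


# One hypothetical atom at the origin with a 1.8 angstrom radius.
atoms = [((0.0, 0.0, 0.0), 1.8)]
profile = [dielectric((x, 0.0, 0.0), atoms) for x in (0.0, 1.0, 2.0, 4.0, 50.0)]
```

Under these assumptions the permittivity rises continuously from `eps_in` at the atom center toward `eps_out` far from the solute, with no sharp molecular surface in between.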

20.

BION-2: Predicting Positions of Non-Specifically Bound Ions on Protein Surface by a Gaussian-Based Treatment of Electrostatics.

Shashikala, H B Mihiri; Chakravorty, Arghya; Panday, Shailesh Kumar; Alexov, Emil.

Int J Mol Sci ; 22(1)2020 Dec 29.

Article in English | MEDLINE | ID: mdl-33383946

ABSTRACT

Ions play significant roles in biological processes: they may bind specifically to a protein site or non-specifically on its surface. Although the role of specifically bound ions ranges from actively providing structural compactness via coordination of charge-charge interactions to numerous enzymatic activities, non-specifically surface-bound ions are also crucial to maintaining a protein's stability, responding to pH and ion concentration changes, and contributing to other biological processes. However, the experimental determination of the positions of non-specifically bound ions is not trivial, since they may have a short residence time and their positions may undergo significant thermal fluctuations. Here, we report a new release of a computational method, BION-2, that predicts the positions of non-specifically surface-bound ions. BION-2 utilizes the Gaussian-based treatment of ions within the framework of the modified Poisson-Boltzmann equation, which does not require a sharp boundary between the protein and the water phase. Thus, the predictions are made by balancing the energy of interaction between the protein charges and the corresponding ions against the desolvation penalty of the ions as they approach the protein. BION-2 is tested against experimentally determined ion positions, and it is demonstrated that it outperforms the old BION and other available tools.

Subject(s)

Biophysical Phenomena , Ions/chemistry , Models, Theoretical , Proteins/chemistry , Static Electricity , Algorithms , Models, Molecular , Protein Conformation , Structure-Activity Relationship
