Search | VHL Regional Portal

1.

Rapid discrimination between deleterious and benign missense mutations in the CAGI 6 experiment.

Faraggi, Eshel; Jernigan, Robert L; Kloczkowski, Andrzej.

Hum Genomics ; 18(1): 89, 2024 Aug 27.

Article in English | MEDLINE | ID: mdl-39192324

ABSTRACT

We describe the machine learning tool that we applied in the CAGI 6 experiment to predict whether single residue mutations in proteins are deleterious or benign. This tool was trained using only single sequences, i.e., without multiple sequence alignments or structural information. Instead, we used global characterizations of the protein sequence. Training and testing data for human gene mutations was obtained from ClinVar (ncbi.nlm.nih.gov/pub/ClinVar/), and for non-human gene mutations from Uniprot (www.uniprot.org). Testing was done on post-training data from ClinVar. This testing yielded high AUC and Matthews correlation coefficient (MCC) for well trained examples but low generalizability. For genes with either sparse or unbalanced training data, the prediction accuracy is poor. The resulting prediction server is available online at http://www.mamiris.com/Shoni.cagi6.
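
The abstract reports Matthews correlation coefficient (MCC) and notes poor accuracy for genes with unbalanced training data. As a minimal sketch of why MCC is informative in that situation, the snippet below computes it from a binary confusion matrix. The counts are hypothetical and are not taken from the paper; the function name is chosen here for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Return the MCC for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is taken as 0 when any marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: a perfect classifier on balanced data.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)

# Hypothetical counts: always predicting "benign" on a 95/5 split.
# Accuracy would be 95%, yet the MCC is 0.
majority = matthews_corrcoef(tp=0, tn=95, fp=0, fn=5)
```

With these made-up counts, a predictor that ignores the minority class still looks accurate but scores zero on MCC, which is one reason the metric suits unbalanced variant sets.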

Subjects

Machine Learning , Missense Mutation , Humans , Missense Mutation/genetics , Software , Computational Biology/methods , Proteins/genetics

2.

Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A.

Jain, Shantanu; Trinidad, Marena; Nguyen, Thanh Binh; Jones, Kaiya; Neto, Santiago Diaz; Ge, Fang; Glagovsky, Ailin; Jones, Cameron; Moran, Giankaleb; Wang, Boqi; Rahimi, Kobra; Çalici, Sümeyra Zeynep; Cedillo, Luis R; Berardelli, Silvia; Özden, Buse; Chen, Ken; Katsonis, Panagiotis; Williams, Amanda; Lichtarge, Olivier; Rana, Sadhna; Pradhan, Swatantra; Srinivasan, Rajgopal; Sajeed, Rakshanda; Joshi, Dinesh; Faraggi, Eshel; Jernigan, Robert; Kloczkowski, Andrzej; Xu, Jierui; Song, Zigang; Özkan, Selen; Padilla, Natàlia; de la Cruz, Xavier; Acuna-Hidalgo, Rocio; Grafmüller, Andrea; Jiménez Barrón, Laura T; Manfredi, Matteo; Savojardo, Castrense; Babbi, Giulia; Martelli, Pier Luigi; Casadio, Rita; Sun, Yuanfei; Zhu, Shaowen; Shen, Yang; Pucci, Fabrizio; Rooman, Marianne; Cia, Gabriel; Raimondi, Daniele; Hermans, Pauline; Kwee, Sofia; Chen, Ella.

bioRxiv ; 2024 Jun 17.

Article in English | MEDLINE | ID: mdl-38798479

ABSTRACT

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.

3.

Tubulin CFEOM mutations both inhibit or activate kinesin motor activity.

Luchniak, Anna; Roy, Pallavi Sinha; Kumar, Ambuj; Schneider, Ian C; Gelfand, Vladimir I; Jernigan, Robert L; Gupta, Mohan L.

Mol Biol Cell ; 35(3): ar32, 2024 Mar 01.

Article in English | MEDLINE | ID: mdl-38170592

ABSTRACT

Kinesin-mediated transport along microtubules is critical for axon development and health. Mutations in the kinesin Kif21a, or the microtubule subunit β-tubulin, inhibit axon growth and/or maintenance resulting in the eye-movement disorder congenital fibrosis of the extraocular muscles (CFEOM). While most examined CFEOM-causing β-tubulin mutations inhibit kinesin-microtubule interactions, Kif21a mutations activate the motor protein. These contrasting observations have led to opposed models of inhibited or hyperactive Kif21a in CFEOM. We show that, contrary to other CFEOM-causing β-tubulin mutations, R380C enhances kinesin activity. Expression of β-tubulin-R380C increases kinesin-mediated peroxisome transport in S2 cells. The binding frequency, percent motile engagements, run length and plus-end dwell time of Kif21a are also elevated on β-tubulin-R380C compared with wildtype microtubules in vitro. This conserved effect persists across tubulins from multiple species and kinesins from different families. The enhanced activity is independent of tail-mediated kinesin autoinhibition and thus utilizes a mechanism distinct from CFEOM-causing Kif21a mutations. Using molecular dynamics, we visualize how β-tubulin-R380C allosterically alters critical structural elements within the kinesin motor domain, suggesting a basis for the enhanced motility. These findings resolve the disparate models and confirm that inhibited or increased kinesin activity can both contribute to CFEOM. They also demonstrate the microtubule's role in regulating kinesins and highlight the importance of balanced transport for cellular and organismal health.

Subjects

Ophthalmoplegia , Tubulin , Humans , Tubulin/metabolism , Kinesins/metabolism , Ophthalmoplegia/genetics , Ophthalmoplegia/metabolism , Mutation/genetics , Microtubules/metabolism , Motor Activity

4.

New alignment method for remote protein sequences by the direct use of pairwise sequence correlations and substitutions.

Jia, Kejue; Kilinc, Mesih; Jernigan, Robert L.

Front Bioinform ; 3: 1227193, 2023.

Article in English | MEDLINE | ID: mdl-37900964

ABSTRACT

Understanding protein sequences and how they relate to the functions of proteins is extremely important. One of the most basic operations in bioinformatics is sequence alignment and usually the first things learned from these are which positions are the most conserved and often these are critical parts of the structure, such as enzyme active site residues. In addition, the contact pairs in a protein usually correspond closely to the correlations between residue positions in the multiple sequence alignment, and these usually change in a systematic and coordinated way, if one position changes then the other member of the pair also changes to compensate. In the present work, these correlated pairs are taken as anchor points for a new type of sequence alignment. The main advantage of the method here is its combining the remote homolog detection from our method PROST with pairwise sequence substitutions in the rigorous method from Kleinjung et al. We show a few examples of some resulting sequence alignments, and how they can lead to improvements in alignments for function, even for a disordered protein.

5.

JSONWP: a static website generator for protein bioinformatics research.

Kilinc, Mesih; Jia, Kejue; Jernigan, Robert L.

Bioinform Adv ; 3(1): vbad154, 2023.

Article in English | MEDLINE | ID: mdl-37904893

ABSTRACT

Motivation: Presenting the integrated results of bioinformatics research can be challenging and requires sophisticated visualization components, which can be time-consuming to develop. This article presents a new way to effectively communicate research findings. Results: We have developed a static web page generator, JSONWP, which is specifically designed for protein bioinformatics research. Utilizing React (a JavaScript library used to build interactive and dynamic user interfaces for web applications), we have integrated publicly available bioinformatics visualization components to provide standardized access to these components. JSON (JavaScript Object Notation, a lightweight textual data format often used to structure and exchange information between different software tools) is used as the input source due to its ability to represent nearly all types of data using key and value pairs. This allows researchers to use their preferred programming language to create a JSON representation, which can then be converted into a website by JSONWP. No server or domain is required to host the website, as only the publicly accessible JSON file is required. Conclusions: Overall, JSONWP provides a useful new tool for bioinformatics researchers to effectively communicate their findings. The open-source implementation is located at https://github.com/MesihK/react-json-wpbuilder, and the tool can be used at jsonwp.onrender.com.
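
The abstract notes that researchers can produce the JSON input in any language. The sketch below shows only that generic step in Python: building a nested key/value structure and serializing it. The keys (`title`, `sections`, `heading`, `table`) are hypothetical and are not taken from the JSONWP documentation; the real schema should be looked up in the project repository.

```python
import json

# Hypothetical page description. These keys are illustrative only and
# are NOT the schema that JSONWP expects.
page = {
    "title": "Example protein report",
    "sections": [
        {"heading": "Sequence", "text": "MKTAYIAKQR"},
        {
            "heading": "Scores",
            "table": [["position", "score"], [1, 0.12], [2, 0.87]],
        },
    ],
}

# Serialize to text (this string is what would be saved and published).
text = json.dumps(page, indent=2)

# Parsing the text back recovers the same structure.
restored = json.loads(text)
```

Because the format is plain text with nested key/value pairs, the same file could be produced equally well from R, Julia, or a shell script.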

6.

Characterizing interactions in E-cadherin assemblages.

Shome, Sayane; Jia, Kejue; Sivasankar, Sanjeevi; Jernigan, Robert L.

Biophys J ; 122(15): 3069-3077, 2023 Aug 08.

Article in English | MEDLINE | ID: mdl-37345249

ABSTRACT

Cadherin intermolecular interactions are critical for cell-cell adhesion and play essential roles in tissue formation and the maintenance of tissue structures. In this study, we focus on E-cadherin, a classical cadherin that connects epithelial cells, to understand how they interact in cis and trans conformations when attached to the same cell or opposing cells. We employ coevolutionary sequence analysis and molecular dynamics simulations to confirm previously known interaction sites as well as to identify new interaction sites. The sequence coevolutionary results yield a surprising result indicating that there are no strongly favored intermolecular interaction sites, which is unusual and suggests that many interaction sites may be possible, with none being strongly preferred over others. By using molecular dynamics, we test the persistence of these interactions and how they facilitate adhesion. We build several types of cadherin assemblages, with different numbers and combinations of cis and trans interfaces to understand how these conformations act to facilitate adhesion. Our results suggest that, in addition to the established interaction sites on the EC1 and EC2 domains, an additional plausible cis interface at the EC3-EC5 domain exists. Furthermore, we identify specific mutations at cis/trans binding sites that impair adhesion within E-cadherin assemblages.

Subjects

Cadherins , Binding Sites , Cadherins/chemistry , Cadherins/metabolism , Cell Adhesion , Mutation , Protein Binding , Animals , Mice

7.

Predicting allosteric pockets in protein biological assemblages.

Kumar, Ambuj; Kaynak, Burak T; Dorman, Karin S; Doruker, Pemra; Jernigan, Robert L.

Bioinformatics ; 39(5), 2023 May 04.

Article in English | MEDLINE | ID: mdl-37115636

ABSTRACT

MOTIVATION: Allostery enables changes to the dynamic behavior of a protein at distant positions induced by binding. Here, we present APOP, a new allosteric pocket prediction method, which perturbs the pockets formed in the structure by stiffening pairwise interactions in the elastic network across the pocket, to emulate ligand binding. Ranking the pockets based on the shifts in the global mode frequencies, as well as their mean local hydrophobicities, leads to high prediction success when tested on a dataset of allosteric proteins, composed of both monomers and multimeric assemblages. RESULTS: Out of the 104 test cases, APOP predicts known allosteric pockets for 92 within the top 3 rank out of multiple pockets available in the protein. In addition, we demonstrate that APOP can also find new alternative allosteric pockets in proteins. Particularly interesting findings are the discovery of previously overlooked large pockets located in the centers of many protein biological assemblages; binding of ligands at these sites would likely be particularly effective in changing the protein's global dynamics. AVAILABILITY AND IMPLEMENTATION: APOP is freely available as an open-source code (https://github.com/Ambuj-UF/APOP) and as a web server at https://apop.bb.iastate.edu/.

Subjects

Proteins , Software , Proteins/chemistry , Ligands , Protein Binding , Binding Sites , Protein Conformation , Allosteric Site

8.

Improved global protein homolog detection with major gains in function identification.

Kilinc, Mesih; Jia, Kejue; Jernigan, Robert L.

Proc Natl Acad Sci U S A ; 120(9): e2211823120, 2023 Feb 28.

Article in English | MEDLINE | ID: mdl-36827259

ABSTRACT

There are several hundred million protein sequences, but the relationships among them are not fully available from existing homolog detection methods. There is an essential need for an improved method to push homolog detection to lower levels of sequence identity. The method used here relies on a language model to represent proteins numerically in a matrix (an embedding) and uses discrete cosine transforms to compress the data to extract the most essential part, significantly reducing the data size. This PRotein Ortholog Search Tool (PROST) is significantly faster with linear runtimes, and most importantly, computes the distances between pairs of protein sequences to yield homologs at significantly lower levels of sequence identity than previously. The extent of allosteric effects in proteins points out the importance of global aspects of structure and sequence. PROST excels at global homology detection but not at detecting local homologs. Results are validated by strong similarities between the corresponding pairs of structures. The number of remote homologs detected increased significantly and pushes the effective sequence matches more deeply into the twilight zone. Human protein sequences presently having no assigned function now find significant numbers of putative homologs for 93% of cases and structurally verified assigned functions for 76.4% of these cases. The data compression enables massive searches for homologs with short search times while yielding significant gains in the numbers of remote homologs detected. The method is sufficiently efficient to permit whole-genome/proteome comparisons. The PROST web server is accessible at https://mesihk.github.io/prost.
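
The abstract describes compressing a language-model embedding with discrete cosine transforms so that only the most essential part is kept. The sketch below illustrates that general idea on a single 1-D channel using a plain DCT-II and truncation to the lowest-frequency coefficients. It is a simplification under stated assumptions, not PROST's implementation: the input values, the `keep` parameter, and the function names are all hypothetical.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal))
        for k in range(n)
    ]


def compress(signal, keep):
    """Keep only the first `keep` low-frequency DCT coefficients."""
    return dct2(signal)[:keep]


# Hypothetical 8-value embedding channel (one column of an embedding matrix).
channel = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]

# A fixed-length summary regardless of how long the channel is.
compact = compress(channel, keep=3)
```

Truncating to a fixed number of coefficients gives every protein a representation of the same size, which is what makes direct distance comparisons between sequences of different lengths straightforward.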

Subjects

Data Compression , Proteome , Humans , Amino Acid Sequence , Search Engine , Genome , Protein Databases

9.

Functional Protein Dynamics Directly from Sequences.

Jia, Kejue; Kilinc, Mesih; Jernigan, Robert L.

J Phys Chem B ; 127(9): 1914-1921, 2023 Mar 09.

Article in English | MEDLINE | ID: mdl-36848294

ABSTRACT

The sequence correlations within a protein multiple sequence alignment are routinely being used to predict contacts within its structure, but here we point out that these data can also be used to predict a protein's dynamics directly. The elastic network protein dynamics models rely directly upon the contacts, and the normal modes of motion are obtained from the decomposition of the inverse of the contact map. To make the direct connection between sequence and dynamics, it is necessary to apply coarse-graining to the structure at the level of one point per amino acid, which has often been done, and protein coarse-grained dynamics from elastic network models has been highly successful, particularly in representing the large-scale motions of proteins that usually relate closely to their functions. The interesting implication of this is that it is not necessary to know the structure itself to obtain its dynamics and instead to use the sequence information directly to obtain the dynamics.
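
The abstract's argument is that an elastic network model needs only a contact map, so contacts predicted from sequence correlations are enough to set one up. As a rough sketch of that first step, the snippet below builds the Kirchhoff (connectivity) matrix used in Gaussian network models from a list of residue contacts. The contact list is hypothetical; in practice it would come from a coevolution analysis, and the modes would then be obtained by eigendecomposition of this matrix with a linear algebra library.

```python
def kirchhoff(contacts, n):
    """Build the n x n Kirchhoff matrix from a list of (i, j) contacts."""
    gamma = [[0.0] * n for _ in range(n)]
    for i, j in contacts:
        # Off-diagonal entries are -1 for residues in contact.
        gamma[i][j] -= 1.0
        gamma[j][i] -= 1.0
        # Diagonal entries count the contacts of each residue.
        gamma[i][i] += 1.0
        gamma[j][j] += 1.0
    return gamma


# Hypothetical contacts: a five-residue chain plus one long-range pair.
contacts = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
gamma = kirchhoff(contacts, n=5)
```

No coordinates appear anywhere in the construction, which is the point the abstract makes: the one-point-per-residue network is fully specified by which pairs are in contact.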

Subjects

Amino Acids , Proteins , Protein Conformation , Molecular Models , Proteins/chemistry , Motion

10.

Opinion: Protein folds vs. protein folding: Differing questions, different challenges.

Chen, Shi-Jie; Hassan, Mubashir; Jernigan, Robert L; Jia, Kejue; Kihara, Daisuke; Kloczkowski, Andrzej; Kotelnikov, Sergei; Kozakov, Dima; Liang, Jie; Liwo, Adam; Matysiak, Silvina; Meller, Jarek; Micheletti, Cristian; Mitchell, Julie C; Mondal, Sayantan; Nussinov, Ruth; Okazaki, Kei-Ichi; Padhorny, Dzmitry; Skolnick, Jeffrey; Sosnick, Tobin S; Stan, George; Vakser, Ilya; Zou, Xiaoqin; Rose, George D.

Proc Natl Acad Sci U S A ; 120(1): e2214423119, 2023 Jan 03.

Article in English | MEDLINE | ID: mdl-36580595

Subjects

Protein Folding , Proteins , Proteins/metabolism , Thermodynamics

11.

Entropies Derived from the Packing Geometries within a Single Protein Structure.

Khade, Pranav M; Jernigan, Robert L.

ACS Omega ; 7(24): 20719-20730, 2022 Jun 21.

Article in English | MEDLINE | ID: mdl-35755337

ABSTRACT

A fast, simple, yet robust method to calculate protein entropy from a single protein structure is presented here. The focus is on the atomic packing details, which are calculated by combining Voronoi diagrams and Delaunay tessellations. Even though the method is simple, the entropies computed exhibit an extremely high correlation with the entropies previously derived by other methods based on quasi-harmonic motions, quantum mechanics, and molecular dynamics simulations. These packing-based entropies account directly for the local freedom and provide entropy for any individual protein structure that could be used to compute free energies directly during simulations for the generation of more reliable trajectories and also for better evaluations of modeled protein structures. Physico-chemical properties of amino acids are compared with these packing entropies to uncover the relationships with the entropies of different residue types. A public packing entropy web server is provided at packing-entropy.bb.iastate.edu, and the application programing interface is available within the PACKMAN (https://github.com/Pranavkhade/PACKMAN) package.

12.

Coarse-graining protein structures into their dynamic communities with DCI, a dynamic community identifier.

Kumar, Ambuj; Khade, Pranav M; Dorman, Karin S; Jernigan, Robert L.

Bioinformatics ; 38(10): 2727-2733, 2022 May 13.

Article in English | MEDLINE | ID: mdl-35561187

ABSTRACT

SUMMARY: A new dynamic community identifier (DCI) is presented that relies upon protein residue dynamic cross-correlations generated by Gaussian elastic network models to identify those residue clusters exhibiting motions within a protein. A number of examples of communities are shown for diverse proteins, including GPCRs. It is a tool that can immediately simplify and clarify the most essential functional moving parts of any given protein. Proteins usually can be subdivided into groups of residues that move as communities. These are usually densely packed local sub-structures, but in some cases can be physically distant residues identified to be within the same community. The set of these communities for each protein are the moving parts. The ways in which these are organized overall can aid in understanding many aspects of functional dynamics and allostery. DCI enables a more direct understanding of functions including enzyme activity, action across membranes and changes in the community structure from mutations or ligand binding. The DCI server is freely available on a web site (https://dci.bb.iastate.edu/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Grain Proteins , Motion , Normal Distribution , Protein Conformation , Proteins/chemistry

13.

Using Surface Hydrophobicity Together with Empirical Potentials to Identify Protein-Protein Binding Sites: Application to the Interactions of E-cadherins.

Jernigan, Robert L; Khade, Pranav; Kumar, Ambuj; Kloczkowski, Andrzej.

Methods Mol Biol ; 2340: 41-50, 2022.

Article in English | MEDLINE | ID: mdl-35167069

ABSTRACT

Studying the interactions within protein structures can inform about the details of how proteins of various types interact and aggregate. Empirical contact potentials have proven to be extremely important in the evaluation of individual modeled protein structures, but have found few applications to protein-protein interactions. In part, this is caused by a lack of properly formulated potentials with a proper reference state. Since the comparisons are made between different bound structures, the proper reference state should take into account other contacts. Therefore, a preferred reference state should be defined with respect to a given residue type interacting with an average residue instead of interacting with solvent as typically is used in derivation of statistical contact potentials. Here, a two-stage procedure for generating and evaluating interacting protein pairs is described, and an example of E-cadherin interactions is shown.

Subjects

Cadherins , Binding Sites , Cadherins/metabolism , Hydrophobic and Hydrophilic Interactions , Protein Binding , Solvents

14.

PACKMAN-Molecule: Python toolbox for structural bioinformatics.

Khade, Pranav M; Jernigan, Robert L.

Bioinform Adv ; 2(1): vbac007, 2022.

Article in English | MEDLINE | ID: mdl-36699371

ABSTRACT

PACKMAN-molecule is a Structural Bioinformatics toolbox in the form of an Application Programming Interface that contains several utilities that can be used for structural bioinformatics applications. It has already been used in several applications, and its added features and unique object hierarchy make it readily extensible, feature-rich and user-friendly. The tutorial for it is available at: https://py-packman.readthedocs.io/en/latest/tutorials/molecule.html. Availability and implementation: PACKMAN-Molecule is freely available with an MIT license on GitHub at https://github.com/Pranavkhade/PACKMAN.

15.

Simulated Drug Efflux for the AbgT Family of Membrane Transporters.

Shome, Sayane; Sankar, Kannan; Jernigan, Robert L.

J Chem Inf Model ; 61(11): 5673-5681, 2021 Nov 22.

Article in English | MEDLINE | ID: mdl-34714659

ABSTRACT

Drug extrusion through molecular efflux pumps is an important mechanism for the survival of many pathogenic bacteria by removing drugs, providing multidrug resistance (MDR). Understanding molecular mechanisms for drug extrusion in multidrug efflux pumps is important for the development of new antiresistance drugs. The AbgT family of transporters involved in the folic acid biosynthesis pathway represents one such important efflux pump system. In addition to the transport of the folic acid precursor p-amino benzoic acid (PABA), members of this family are involved in the efflux of several sulfa drugs, conferring drug resistance to the bacteria. With the availability of structures for two members of this family (YdaH and MtrF), we investigate molecular pathways for transport of PABA and a sulfa drug (sulfamethazine) particularly for the YdaH transporter using steered molecular dynamics. Our analyses reveal the probable ligand migration pathways through the transporter, which also identifies key residues along the transport pathway. In addition, simulations using both PABA and sulfamethazine show how the protein is able to transport ligands of different shapes and sizes out of the pathogen. Our observations confirm previously reported functional residues for transport along the pathways by which YdaH transporters achieve antibiotic resistance to shuttle drugs out of the cells.

Subjects

Membrane Transport Proteins , Pharmaceutical Preparations , Anti-Bacterial Agents/pharmacology , Bacteria/metabolism , Bacterial Proteins/metabolism , Drug Resistance

16.

hdANM: a new comprehensive dynamics model for protein hinges.

Khade, Pranav M; Scaramozzino, Domenico; Kumar, Ambuj; Lacidogna, Giuseppe; Carpinteri, Alberto; Jernigan, Robert L.

Biophys J ; 120(22): 4955-4965, 2021 Nov 16.

Article in English | MEDLINE | ID: mdl-34687719

ABSTRACT

Hinge motions are essential for many protein functions, and their dynamics are important to understand underlying biological mechanisms. The ways that these motions are represented by various computational methods differ significantly. By focusing on a specific class of motion, we have developed a new hinge-domain anisotropic network model (hdANM) that is based on the prior identification of flexible hinges and rigid domains in the protein structure and the subsequent generation of global hinge motions. This yields a set of motions in which the relative translations and rotations of the rigid domains are modulated and controlled by the deformation of the flexible hinges, leading to a more restricted, specific view of these motions. hdANM is the first model, to our knowledge, that combines information about protein hinges and domains to model the characteristic hinge motions of a protein. The motions predicted with this new elastic network model provide important conceptual advantages for understanding the underlying biological mechanisms. As a matter of fact, the generated hinge movements are found to resemble the expected mechanisms required for the biological functions of diverse proteins. Another advantage of this model is that the domain-level coarse graining makes it significantly more computationally efficient, enabling the generation of hinge motions within even the largest molecular assemblies, such as those from cryo-electron microscopy. hdANM is also comprehensive as it can perform in the same way as the well-known protein dynamics models (anisotropic network model, rotations-translations of blocks, and nonlinear rigid block normal mode analysis), depending on the definition of flexible and rigid parts in the protein structure and on whether the motions are extrapolated in a linear or nonlinear fashion. Furthermore, our results indicate that hdANM produces more realistic motions as compared to the anisotropic network model. 
hdANM is an open-source software, freely available, and hosted on a user-friendly website.

Subjects

Algorithms , Proteins , Computer Simulation , Cryoelectron Microscopy , Molecular Models , Protein Conformation

17.

Ligand Binding Introduces Significant Allosteric Shifts in the Locations of Protein Fluctuations.

Kumar, Ambuj; Jernigan, Robert L.

Front Mol Biosci ; 8: 733148, 2021.

Article in English | MEDLINE | ID: mdl-34540902

ABSTRACT

Allostery is usually considered to be a mechanism for transmission of signals associated with physical or dynamic changes in some part of a protein. Here, we investigate the changes in fluctuations across the protein upon ligand binding based on the fluctuations computed with elastic network models. These results suggest that binding reduces the fluctuations at the binding site but increases fluctuations at remote sites, but not to fully compensating extents. If there were complete conservation of entropy, then only the enthalpies of binding would matter and not the entropies; however this does not appear to be the case. Experimental evidence also suggests that energies and entropies of binding can compensate but that the extent of compensation varies widely from case to case. Our results do however always show transmission of an allosteric signal to distant locations where the fluctuations are increased. These fluctuations could be used to compute entropies to improve evaluations of the thermodynamics of binding. We also show the allosteric relationship between peptide binding in the GroEL trans-ring that leads directly to the release of GroES from the GroEL-GroES cis-ring. This finding provides an example of how calculating these changes to protein dynamics induced by the binding of an allosteric ligand can regulate protein function and mechanism.

18.

Xyloglucan Xylosyltransferase 1 Displays Promiscuity Toward Donor Substrates During in Vitro Reactions.

Ehrlich, Jacqueline J; Weerts, Richard M; Shome, Sayane; Culbertson, Alan T; Honzatko, Richard B; Jernigan, Robert L; Zabotina, Olga A.

Plant Cell Physiol ; 62(12): 1890-1901, 2021 Dec 27.

Article in English | MEDLINE | ID: mdl-34265062

ABSTRACT

Glycosyltransferases (GTs) are a large family of enzymes that add sugars to a broad range of acceptor substrates, including polysaccharides, proteins and lipids, by utilizing a wide variety of donor substrates in the form of activated sugars. Individual GTs have generally been considered to exhibit a high level of substrate specificity, but this has not been thoroughly investigated across the extremely large set of GTs. Here we investigate xyloglucan xylosyltransferase 1 (XXT1), a GT involved in the synthesis of the plant cell wall polysaccharide, xyloglucan. Xyloglucan has a glucan backbone, with initial side chain substitutions exclusively composed of xylose from uridine diphosphate (UDP)-xylose. While this conserved substitution pattern suggests a high substrate specificity for XXT1, our in vitro kinetic studies elucidate a more complex set of behavior. Kinetic studies demonstrate comparable kcat values for reactions with UDP-xylose and UDP-glucose, while reactions with UDP-arabinose and UDP-galactose are over 10-fold slower. Using kcat/KM as a measure of efficiency, UDP-xylose is 8-fold more efficient as a substrate than the next best alternative, UDP-glucose. To the best of our knowledge, we are the first to demonstrate that not all plant XXTs are highly substrate specific and some do show significant promiscuity in their in vitro reactions. Kinetic parameters alone likely do not explain the high substrate selectivity in planta, suggesting that there are additional control mechanisms operating during polysaccharide biosynthesis. Improved understanding of substrate specificity of the GTs will aid in protein engineering, development of diagnostic tools, and understanding of biological systems.
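
The abstract compares donor substrates by kcat/KM and notes that comparable kcat values can still go with an 8-fold difference in efficiency. The arithmetic behind that statement can be sketched as follows. The numbers are hypothetical, chosen only to reproduce an 8-fold ratio, and are not the measured kinetic parameters from the study.

```python
def catalytic_efficiency(kcat, km):
    """Return kcat/KM; units follow the inputs (e.g. s^-1 per mM)."""
    if km <= 0:
        raise ValueError("KM must be positive")
    return kcat / km


# Hypothetical values: identical kcat, but an 8-fold difference in KM.
donor_a = catalytic_efficiency(kcat=1.0, km=0.5)
donor_b = catalytic_efficiency(kcat=1.0, km=4.0)

# Relative efficiency of donor A over donor B.
fold = donor_a / donor_b
```

When kcat is the same for two substrates, the whole efficiency difference comes from KM, so the fold change in kcat/KM equals the inverse fold change in KM.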

Subjects

Glucans/biosynthesis , Pentosyltransferases/genetics , Plant Proteins/genetics , Plants/enzymology , Glucans/genetics , Kinetics , Pentosyltransferases/metabolism , Plant Proteins/metabolism , Plants/metabolism , Substrate Specificity

19.

New amino acid substitution matrix brings sequence alignments into agreement with structure matches.

Jia, Kejue; Jernigan, Robert L.

Proteins ; 89(6): 671-682, 2021 Jun.

Article in English | MEDLINE | ID: mdl-33469973

ABSTRACT

Protein sequence matching presently fails to identify many structures that are highly similar, even when they are known to have the same function. The high packing densities in globular proteins lead to interdependent substitutions, which have not previously been considered for amino acid similarities. At present, sequence matching compares sequences based only upon the similarities of single amino acids, ignoring the fact that in densely packed proteins, there are additional conservative substitutions representing exchanges between two interacting amino acids, such as a small-large pair changing to a large-small pair, substitutions that are not individually so conservative. Here we show that including information for such pairs of substitutions yields improved sequence matches, and that these yield significant gains in the agreements between sequence alignments and structure matches of the same protein pair. The result shows sequence segments matched where structure segments are aligned. There are gains for all 2002 collected cases where the sequence alignments were not previously congruent with the structure matches. Our results also demonstrate a significant gain in detecting homology for "twilight zone" protein sequences. The amino acid substitution metrics derived have many other potential applications, for annotations, protein design, mutagenesis design, and empirical potential derivation.

Subjects

Algorithms , Amino Acid Substitution , Amino Acids/chemistry , Proteins/chemistry , Amino Acid Sequence , Amino Acids/metabolism , Protein Databases , Datasets as Topic , Humans , Molecular Models , Protein Engineering/methods , Proteins/metabolism , Sequence Alignment , Amino Acid Sequence Homology

20.

LARGE-SCALE MULTIPLE INFERENCE OF COLLECTIVE DEPENDENCE WITH APPLICATIONS TO PROTEIN FUNCTION.

Jernigan, Robert; Jia, Kejue; Ren, Zhao; Zhou, Wen.

Ann Appl Stat ; 15(2): 902-924, 2021 Jun.

Article in English | MEDLINE | ID: mdl-35910493

ABSTRACT

Measuring the dependence of k ≥ 3 random variables and drawing inference from such higher-order dependences are scientifically important yet challenging. Motivated here by protein coevolution with multivariate categorical features, we consider an information theoretic measure of higher-order dependence. The proposed collective dependence is a symmetrization of differential interaction information which generalizes the mutual information of a pair of random variables. We show that the collective dependence can be easily estimated and facilitates a test on the dependence of k ≥ 3 random variables. Upon carefully exploring the null space of collective dependence, we devise a Classification-Assisted Large scaLe inference procedure to DEtect significant k-COllective DEpendence among d ≥ k random variables, with the false discovery rate controlled. Finite sample performance of our method is examined via simulations. We apply this method to the multiple protein sequence alignment data to study the residue or position coevolution for two protein families, the elongation factor P family and the zinc knuckle family. We identify novel functional triplets of amino acid residues, whose contributions to the protein function are further investigated. These confirm that the collective dependence does yield additional information important for understanding the protein coevolution compared to the pairwise measures.
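
The abstract builds on interaction information as a generalization of pairwise mutual information. The sketch below computes both quantities for categorical columns using plug-in entropies, to show how a triplet can carry dependence that no pair reveals. It uses the standard co-information formula with one common sign convention, not the symmetrized collective dependence proposed in the paper, and the toy columns are hypothetical.

```python
import math
from collections import Counter


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more aligned columns."""
    rows = list(zip(*columns))
    total = len(rows)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def co_information(x, y, z):
    """Three-way co-information; negative values indicate synergy."""
    return (entropy(x) + entropy(y) + entropy(z)
            - entropy(x, y) - entropy(x, z) - entropy(y, z)
            + entropy(x, y, z))


# Toy columns: x and y are independent, z is their exclusive-or.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]

pair_xy = mutual_information(x, y)
triple = co_information(x, y, z)
```

In this toy case every pair of columns looks independent, yet the three columns together are fully determined, which is the kind of higher-order signal that pairwise coevolution measures would miss.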
