Results 1 - 14 of 14
1.
ACS Biomater Sci Eng ; 7(7): 3156-3165, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34151552

ABSTRACT

The excellent mechanical strength and toughness of spider silk are well characterized experimentally and understood atomistically using computational simulations. However, little attention has been focused on understanding whether the amino acid sequence of β-sheet nanocrystals, which is the key to rendering strength to silk fiber, is optimally chosen to mitigate molecular-scale failure mechanisms. To investigate this, we modeled β-sheet nanocrystals of various representative small/polar/hydrophobic amino acid repeats for determining the sequence motif having superior nanomechanical tensile strength and toughness. The constant velocity pulling of the central β-strand in the nanocrystal, using steered molecular dynamics, showed that homopolymers of small amino acid (alanine/alanine-glycine) sequence motifs, occurring in natural silk fibroin, have better nanomechanical properties than other modeled structures. Further, we analyzed the hydrogen bond (HB) and β-strand pull dynamics of modeled nanocrystals to understand the variation in their rupture mechanisms and explore sequence-dependent mitigating factors contributing to their superior mechanical properties. Surprisingly, the enhanced side-chain interactions in homopoly-polar/hydrophobic amino acid models are unable to augment backbone HB cooperativity to increase mechanical strength. Our analyses suggest that nanocrystals of pristine silk sequences most likely achieve superior mechanical strength by optimizing side-chain interaction, packing, and main-chain HB interactions. Thus, this study suggests that the nanocrystal β-sheet sequence plays a crucial role in determining the nanomechanical properties of silk, and the evolutionary process has optimized it in natural silk. This study provides insight into the molecular design principle of silk with implications in the genetically modified artificial synthesis of silk-like biomaterials.


Subjects
Fibroins , Nanoparticles , Amino Acid Sequence , Protein Conformation, beta-Strand , Silk
2.
Mol Biosyst ; 10(6): 1469-80, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24668165

ABSTRACT

Despite recent advances, it is not yet clear how intrinsically disordered regions in proteins recognize their targets without any defined structures. Short linear motifs had been proposed to mediate molecular recognition by disordered regions; however, the underlying structural prerequisite remains elusive. Moreover, the role of short linear motifs in DNA recognition has not been studied. We report a repertoire of short evolutionarily Conserved Recognition Elements (CoREs) in long intrinsically disordered regions, which have very distinct amino-acid propensities from those of known motifs, and exhibit a strong tendency to retain their three-dimensional conformations compared to adjacent regions. The majority of CoREs directly interact with the DNA in the available 3D structures, which is further supported by literature evidence, analyses of ΔΔG values of DNA-binding energies and threading-based prediction of DNA binding potential. CoREs were enriched in cancer-associated missense mutations, further strengthening their functional nature. Significant enrichment of glycines in CoREs and the preference of glycyl ϕ-Ψ values within the left-handed bridge range in the l-disallowed region of the Ramachandran plot suggest that Gly-to-nonGly mutations within CoREs might alter the backbone conformation and consequently the function, a hypothesis that we reconciled using available mutation data. We conclude that CoREs might serve as bait for DNA recognition by long disordered regions and that certain mutations in these peptides can disrupt their DNA binding potential and consequently the protein function. We further hypothesize that the preferred conformations of CoREs and of glycyl residues therein might play an important role in DNA binding. The highly ordered nature of CoREs hints at a therapeutic strategy to inhibit malicious molecular interactions using small molecules mimicking CoRE conformations.


Subjects
Computational Biology/methods , DNA-Binding Proteins/chemistry , DNA/metabolism , Intrinsically Disordered Proteins/chemistry , Peptides/chemistry , Algorithms , Amino Acid Motifs , Binding Sites , Conserved Sequence , DNA-Binding Proteins/metabolism , Intrinsically Disordered Proteins/metabolism , Models, Molecular , Peptides/metabolism , Protein Conformation , Sequence Analysis, Protein
3.
BMC Syst Biol ; 6: 10, 2012 Feb 06.
Article in English | MEDLINE | ID: mdl-22309974

ABSTRACT

BACKGROUND: We consider the possibility of engineering metabolic pathways in a chassis organism in order to synthesize novel target compounds that are heterologous to the chassis. For this purpose, we model metabolic networks through hypergraphs where reactions are represented by hyperarcs. Each hyperarc represents an enzyme-catalyzed reaction that transforms a set of substrate compounds into product compounds. We follow a retrosynthetic approach in order to search in the metabolic space (hypergraphs) for pathways (hyperpaths) linking the target compounds to a source set of compounds. RESULTS: To select the best pathways to engineer, we have developed an objective function that computes the cost of inserting a heterologous pathway in a given chassis organism. In order to find minimum-cost pathways, we propose in this paper two methods based on steady state analysis and network topology that are, to the best of our knowledge, the first to enumerate all possible heterologous pathways linking a target compound to a source set of compounds. In the context of metabolic engineering, the source set is composed of all naturally produced chassis compounds (endogenous chassis metabolites) and the target set can be any compound of the chemical space. We also provide an algorithm for identifying precursors which can be supplied to the growth media in order to increase the number of ways to synthesize specific target compounds. CONCLUSIONS: We find the topological approach to be faster by several orders of magnitude than the steady state approach. Yet both methods are generally scalable in time with the number of pathways in the metabolic network. Therefore, this work provides a powerful tool for pathway enumeration with direct application to biosynthetic pathway design.


Subjects
Algorithms , Enzymes/metabolism , Metabolic Engineering/methods , Metabolic Networks and Pathways/physiology , Models, Biological
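
The hypergraph model in the abstract above can be illustrated with a minimal sketch. It is not the authors' enumeration method: it only shows how a hyperarc (a reaction that fires once all of its substrates are available) can be represented, and computes which compounds are reachable from a source set by forward propagation. The class name `Hyperarc`, the function `reachable_compounds`, and the toy reactions R1 to R4 are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Hyperarc(NamedTuple):
    """One reaction: a set of substrates transformed into a set of products."""
    name: str
    substrates: FrozenSet[str]
    products: FrozenSet[str]


def reachable_compounds(sources: Set[str], reactions: List[Hyperarc]) -> Set[str]:
    """Return every compound producible from `sources`.

    A hyperarc can fire only when ALL of its substrates are available,
    which is what distinguishes a hypergraph from an ordinary graph.
    """
    available = set(sources)
    changed = True
    while changed:
        changed = False
        for reaction in reactions:
            can_fire = reaction.substrates <= available
            adds_something = not reaction.products <= available
            if can_fire and adds_something:
                available |= reaction.products
                changed = True
    return available


# Toy network (hypothetical compounds): A and B are chassis metabolites, T is the target.
reactions = [
    Hyperarc("R1", frozenset({"A", "B"}), frozenset({"C"})),
    Hyperarc("R2", frozenset({"C"}), frozenset({"D"})),
    Hyperarc("R3", frozenset({"E"}), frozenset({"T"})),  # never fires: E is unavailable
    Hyperarc("R4", frozenset({"D", "A"}), frozenset({"T"})),
]
chassis = {"A", "B"}
closure = reachable_compounds(chassis, reactions)
print("target reachable:", "T" in closure)
```

In this toy network the target is reached through R1, R2 and R4, while R3 stays blocked because its substrate E is neither a source compound nor a product of any reaction that can fire.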
4.
Proteins ; 78(13): 2769-80, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20635423

ABSTRACT

In a variety of threading methods, poorly ranked (low z-score) templates often have good alignments. Here, a new method, TASSER_low-zsc, which identifies these low z-score-ranked templates to improve protein structure prediction accuracy, is described. The approach consists of clustering of threading templates by affinity propagation on the basis of structural similarity (thread_cluster) followed by TASSER modeling, with final models selected by using a TASSER_QA variant. To establish the generality of the approach, templates provided by two threading methods, SP(3) and SPARKS(2), are examined. The SP(3) and SPARKS(2) benchmark datasets consist of 351 and 357 medium/hard proteins (those with moderate to poor quality templates and/or alignments) of length ≤250 residues, respectively. For SP(3) medium and hard targets, using thread_cluster, the TM-scores of the best template improve by approximately 4 and 9% over the original set (without low z-score templates) respectively; after TASSER modeling/refinement and ranking, the best model improves by approximately 7 and 9% over the best model generated with the original template set. Moreover, TASSER_low-zsc generates 22% (43%) more foldable medium (hard) targets. Similar improvements are observed with low-ranked templates from SPARKS(2). The template clustering approach could be applied to other modeling methods that utilize multiple templates to improve structure prediction.


Subjects
Algorithms , Models, Molecular , Protein Conformation , Proteins/chemistry , Caspase 8/chemistry , Cluster Analysis , Computational Biology/methods , Computer Simulation , Protein Structure, Tertiary , Reproducibility of Results
5.
Bioinformatics ; 26(5): 687-8, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20080513

ABSTRACT

In the post-genomic era, the annotation of protein function facilitates the understanding of various biological processes. To extend the range of function annotation methods to the twilight zone of sequence identity, we have developed approaches that exploit both protein tertiary structure and/or protein sequence evolutionary relationships. To serve the scientific community, we have integrated the structure prediction tools, TASSER, TASSER-Lite and METATASSER, and the functional inference tools, FINDSITE, a structure-based algorithm for binding site prediction, Gene Ontology molecular function inference and ligand screening, EFICAz(2), a sequence-based approach to enzyme function inference and DBD-hunter, an algorithm for predicting DNA-binding proteins and associated DNA-binding residues, into a unified web resource, Protein Structure and Function prediction Resource (PSiFR). AVAILABILITY AND IMPLEMENTATION: PSiFR is freely available for use on the web at http://psifr.cssb.biology.gatech.edu/


Subjects
Protein Conformation , Proteins/chemistry , Software , Binding Sites , Databases, Protein , Proteins/metabolism
6.
Proteins ; 77 Suppl 9: 123-7, 2009.
Article in English | MEDLINE | ID: mdl-19639638

ABSTRACT

The performance of the protein structure prediction server pro-sp3-TASSER in CASP8 is described. Compared to CASP7, the major improvement in prediction is in the quality of input models to TASSER. These improvements are due to the PRO-SP(3) threading method, the improved quality of contact predictions provided by TASSER_2.0, multiple short TASSER simulations for building the full-length model, and the accuracy of model selection using the TASSER-QA quality assessment method. Finally, we analyze the overall performance and highlight some successful predictions of the pro-sp3-TASSER server.


Subjects
Computational Biology/methods , Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein/methods , Software , Protein Conformation
7.
Proteins ; 69 Suppl 8: 90-7, 2007.
Article in English | MEDLINE | ID: mdl-17705276

ABSTRACT

An improved TASSER (Threading/ASSEmbly/Refinement) methodology is applied to predict the tertiary structure for all CASP7 targets. TASSER employs template identification by threading, followed by tertiary structure assembly by rearranging continuous template fragments, where conformational space is searched via Parallel Hyperbolic Monte Carlo sampling with an optimized force-field that includes knowledge-based statistical potentials and restraints derived from threading templates. The final models are selected by clustering structures from the low temperature replicas. Improvements in TASSER over CASP6 involve use of better templates from 3D-jury applied to three threading programs, PROSPECTOR_3, SP(3), and SPARKS, and a fragment comparison method for better model ranking. For targets with no reliable templates, a variant of TASSER (chunk-TASSER) is also applied with potentials and restraints extracted from ab initio folded supersecondary chunks of the target to build full-length models. For all 124 CASP targets/domains, the average root-mean-square-deviation (RMSD) from native and alignment coverage of the best initial threading models from 3D-jury are 6.2 Å and 93%, respectively. Following TASSER reassembly, the average RMSD of the best model in the template aligned region decreases to 4.9 Å and the average TM-score increases from 0.617 for the template to 0.678 for the best full-length model. Based on target difficulty, the average TM-scores of the final model to native are 0.904, 0.671, and 0.307 for high-accuracy template-based modeling, template-based modeling, and free modeling targets/domains, respectively. For the more difficult targets, TASSER with modest human intervention performed better in comparison to its server counterpart, MetaTASSER, which used a limited time simulation.


Subjects
Computational Biology/methods , Protein Structure, Tertiary , Computer Simulation , Models, Molecular , Protein Folding , Proteins/chemistry
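
The abstract above reports results as RMSD and TM-score without defining either measure. The sketch below shows how both can be computed from per-residue distances between a model and the native structure, using the commonly cited TM-score form with the length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8. It is a simplification under stated assumptions: the distances are taken from one given superposition, whereas the full TM-score is the maximum over superpositions, and the function names are hypothetical rather than part of TASSER.

```python
import math
from typing import Sequence


def rmsd(distances: Sequence[float]) -> float:
    """Root-mean-square deviation over aligned residue pairs (same unit as input, e.g. Å)."""
    if not distances:
        raise ValueError("need at least one aligned residue pair")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 of the TM-score, in Å."""
    if target_length <= 18:
        # For very short chains the formula gives a non-positive d0.
        raise ValueError("this sketch assumes target_length > 18")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score_fixed_superposition(distances: Sequence[float], target_length: int) -> float:
    """TM-score for ONE given superposition (no search over superpositions).

    `distances` holds the model-to-native distance of each aligned residue;
    unaligned residues contribute nothing, but the sum is still normalized
    by the full target length, so partial coverage lowers the score.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example: 100-residue target, 90 aligned residues, each 2.0 Å off.
example_distances = [2.0] * 90
print("RMSD:", rmsd(example_distances))
print("TM-score:", tm_score_fixed_superposition(example_distances, 100))
```

Unlike RMSD, which is computed only over the aligned residues, this score is normalized by the full target length and down-weights large deviations, which is why the abstract can report both measures for the same set of models.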
8.
In Silico Biol ; 5(4): 379-87, 2005.
Article in English | MEDLINE | ID: mdl-16268782

ABSTRACT

During the course of our large-scale genome analysis, a conserved domain, currently detectable only in the genomes of Drosophila melanogaster, Caenorhabditis elegans and Anopheles gambiae, has been identified. The function of this domain is currently unknown and no function annotation is provided for this domain in the publicly available genomic, protein family and sequence databases. The search for the homologues of this domain in the non-redundant sequence database using PSI-BLAST identified a distant relationship between this family and the alkaline phosphatase-like superfamily, which includes families of aryl sulfatase, N-acetylgalactosamine-4-sulfatase, alkaline phosphatase and 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (iPGM). The fold recognition procedures showed that this new domain could adopt a 3-D fold similar to that of this superfamily. Most of the phosphatases and sulfatases of this superfamily are characterized by functional residues Ser and Cys, respectively, in the topologically equivalent positions. This functionally important site aligns with Ser/Thr in the members of the new family. Additionally, the set of residues responsible for a metal binding site in phosphatases and sulfatases is conserved in the new family. The in-depth analysis suggests that the new family could possess phosphatase activity.


Subjects
Alkaline Phosphatase , Genome, Insect , Genome, Protozoan , Alkaline Phosphatase/chemistry , Alkaline Phosphatase/classification , Alkaline Phosphatase/genetics , Amino Acid Sequence , Animals , Databases, Factual , Models, Molecular , Molecular Sequence Data , Multigene Family , Protein Structure, Tertiary , Sequence Alignment
9.
In Silico Biol ; 4(4): 445-60, 2004.
Article in English | MEDLINE | ID: mdl-15506994

ABSTRACT

In order to bridge the gap between proteins with three-dimensional (3-D) structural information and those without 3-D structures, extensive experimental and computational efforts for structure recognition are being invested. One of the rapid and simple computational approaches for structure recognition makes use of sequence profiles with sensitive profile matching procedures to identify remotely related homologous families. While adopting this approach we used profiles that are generated from structure-based sequence alignment of homologous protein domains of known structures integrated with sequence homologues. We present an assessment of this fast and simple approach. About one year ago, using this approach, we identified structural homologues for 315 sequence families, which were not known to have any 3-D structural information. The subsequent experimental structure determination for at least one of the members in 110 of 315 sequence families allowed a retrospective assessment of the correctness of structure recognition. We demonstrate that correct folds are detected with an accuracy of 96.4% (106/110). Most (81/106) of the associations are made correctly to the specific structural family. For 23/106, the structure associations are valid at the superfamily level. Thus, profiles of protein families of known structure, when used with a sensitive profile-based search procedure, result in structure association of high confidence. Further assignment at the level of superfamily or family would provide clues to probable functions of new proteins. Importantly, the public availability of these profiles from us could enable one to perform genome-wide structure assignment on a local machine in a fast and accurate manner.


Subjects
Computational Biology/methods , Protein Conformation , Sequence Alignment/methods , Sequence Homology, Amino Acid , Amino Acid Sequence , Databases, Protein , Molecular Sequence Data , Protein Folding
10.
BMC Bioinformatics ; 5: 28, 2004 Mar 15.
Article in English | MEDLINE | ID: mdl-15113407

ABSTRACT

BACKGROUND: The SUPFAM database is a compilation of superfamily relationships between protein domain families of either known or unknown 3-D structure. In SUPFAM, sequence families from Pfam and structural families from SCOP are associated, using profile matching, to result in sequence superfamilies of known structure. Subsequently all-against-all family profile matches are made to deduce a list of new potential superfamilies of yet unknown structure. DESCRIPTION: The current version of SUPFAM (release 1.4) corresponds to significant enhancements and major developments compared to the earlier and basic version. In the present version we have used RPS-BLAST, which is robust and sensitive, for profile matching. The reliability of connections between protein families is ensured better than before by use of benchmarked criteria involving strict e-value cut-off and a minimal alignment length condition. An e-value based indication of reliability of connections is now presented in the database. Web access to an RPS-BLAST-based tool to associate a query sequence to one of the family profiles in SUPFAM is available with the current release. In terms of the scientific content the present release of SUPFAM is entirely reorganized with the use of 6190 Pfam families and 2317 structural families derived from SCOP. Due to a steep increase in the number of sequence and structural families used in SUPFAM the details of scientific content in the present release are almost entirely complementary to the previous basic version. Of the 2286 families, we could relate 245 Pfam families with apparently no structural information to families of known 3-D structures, thus resulting in the identification of new families in the existing superfamilies. Using the profiles of 3904 Pfam families of yet unknown structure, an all-against-all comparison involving sequence-profile match resulted in clustering of 96 Pfam families into 39 new potential superfamilies. CONCLUSION: SUPFAM presents many non-trivial superfamily relationships of sequence families involved in a variety of functions and hence the information content is of interest to a wide scientific community. The grouping of related proteins without a known structure in SUPFAM is useful in identifying priority targets for structural genomics initiatives and in the assignment of putative functions. Database URL: http://pauling.mbu.iisc.ernet.in/~supfam.


Subjects
Amino Acid Sequence , Databases, Protein/trends , Peptides/chemistry , Proteins/chemistry , Computational Biology/methods , Protein Structure, Tertiary
11.
In Silico Biol ; 4(4): 563-72, 2004.
Article in English | MEDLINE | ID: mdl-15752073

ABSTRACT

A family of hypothetical proteins, identified predominantly from archaeal genomes, has been analyzed in order to understand its functional characteristics. Using extensive sequence similarity searches it is inferred that this family is remotely related (best sequence identity is 19%) to ClpP proteinases, which belong to the serine proteinase class. This family of hypothetical proteins is referred to as the SDH proteinase family based on the conserved sequential order of Ser, Asp and His residues and predicted serine proteinase activity. Results of fold recognition of SDH family sequences confirmed the remote relationship between SDH proteinases and Clp proteinases and revealed similar tertiary location of putative catalytic triad residues critical for serine proteinase function. However, the best sequence alignment we could obtain suggests that while catalytic Ser is conserved across Clp and SDH proteinases, the locations of the other catalytic triad residues, namely His and Asp, are swapped in their amino acid alignment positions and hence in 3-D structure. The evidence of a conserved catalytic triad suggests that SDH could be a new family of serine proteinases with the fold of Clp proteinase, while sharing the catalytic triad order of the carboxypeptidase clan. A signal peptide sequence identified at the N-terminus of some of the homologues suggests that these might be secretory serine proteinases involved in cleavage of extracellular proteins, while the remote homologues, ClpP proteinases, are known to work in an intracellular environment.


Subjects
Bacteria/enzymology , Bacterial Proteins/chemistry , Bacterial Proteins/classification , Computational Biology , Serine Endopeptidases/chemistry , Serine Endopeptidases/classification , Amino Acid Sequence , Archaeal Proteins/chemistry , Archaeal Proteins/classification , Archaeal Proteins/genetics , Bacterial Proteins/genetics , Genome, Archaeal , Molecular Sequence Data , Phylogeny , Protein Folding , Protein Sorting Signals/genetics , Sequence Alignment , Sequence Homology, Amino Acid , Serine Endopeptidases/genetics
12.
Proteins ; 52(4): 585-97, 2003 Sep 01.
Article in English | MEDLINE | ID: mdl-12910458

ABSTRACT

The members of the family of G-proteins are characterized by their ability to bind and hydrolyze guanosine triphosphate (GTP) to guanosine diphosphate (GDP). Despite a common biochemical function of GTP hydrolysis shared among the members of the family of G-proteins, they are associated with diverse biological roles. The current work describes the identification and detailed analysis of the putative G-proteins encoded in the completely sequenced prokaryotic genomes. Inferences on the biological roles of these G-proteins have been obtained by their classification into known functional subfamilies. We have identified 497 G-proteins in 42 genomes. Seven small GTP-binding protein homologues have been identified in prokaryotes with at least two of the diagnostic sequence motifs of G-proteins conserved. The translation factors have the largest representation (234 sequences) and are found to be ubiquitous, which is consistent with their critical role in protein synthesis. The GTP_OBG subfamily comprises 79 sequences in our dataset. A total of 177 sequences belong to the subfamily of GTPase of unknown function and 154 of these could be associated with domains of known functions such as cell cycle regulation and tRNA modification. The large GTP-binding proteins and the alpha-subunit of heterotrimeric G-proteins are not detected in the genomes of the prokaryotes surveyed.


Subjects
GTP-Binding Proteins/genetics , Genome, Archaeal , Genome, Bacterial , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Databases, Genetic , GTP-Binding Proteins/physiology , Multigene Family/genetics , Phylogeny
13.
Nucleic Acids Res ; 31(1): 486-8, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12520058

ABSTRACT

The database of Phylogeny and ALIgnment of homologous protein structures (PALI) contains three-dimensional (3-D) structure-dependent sequence alignments as well as structure-based phylogenetic trees of protein domains in various families. The latest updated version (Release 2.1) comprises 844 families of homologous proteins involving 3863 protein domain structures with each of these families having at least two members. Each member in a family has been structurally aligned with every other member in the same family using two proteins at a time. In addition, an alignment of multiple structures has also been performed using all the members in a family. Every family with at least three members is associated with two dendrograms, one based on a structural dissimilarity metric and the other based on similarity of topologically equivalenced residues for every pairwise alignment. Apart from these multi-member families, there are 817 single member families in the updated version of PALI. A new feature in the current release of PALI is the integration, with 3-D structural families, of sequences of homologues from the sequence databases. Alignments between homologous proteins of known 3-D structure and those without an experimentally derived structure are also provided for every family in the enhanced version of PALI. The database with several web interfaced utilities can be accessed at: http://pauling.mbu.iisc.ernet.in/~pali.


Subjects
Databases, Protein , Proteins/chemistry , Structural Homology, Protein , Animals , Phylogeny , Protein Structure, Tertiary , Proteins/classification , Proteins/genetics , Sequence Alignment , User-Computer Interface
14.
Nucleic Acids Res ; 30(1): 289-93, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11752317

ABSTRACT

Members of a superfamily of proteins could result from divergent evolution of homologues with insignificant similarity in the amino acid sequences. A superfamily relationship is detected commonly after the three-dimensional structures of the proteins are determined using X-ray analysis or NMR. The SUPFAM database described here relates two homologous protein families in a multiple sequence alignment database of either known or unknown structure. The present release (1.1), which is the first version of the SUPFAM database, has been derived by analysing Pfam, which is one of the commonly used databases of multiple sequence alignments of homologous proteins. The first step in establishing SUPFAM is to relate Pfam families with the families in PALI, which is an alignment database of homologous proteins of known structure that is derived largely from SCOP. The second step involves relating Pfam families which could not be associated reliably with a protein superfamily of known structure. The profile matching procedure, IMPALA, has been used in these steps. The first step resulted in identification of 1280 Pfam families (out of 2697, i.e. 47%) which are related, either by close homologous connection to a SCOP family or by distant relationship to a SCOP family, potentially forming new superfamily connections. Using the profiles of 1417 Pfam families with apparently no structural information, an all-against-all comparison involving a sequence-profile match using IMPALA resulted in clustering of 67 homologous protein families of Pfam into 28 potential new superfamilies. Expansion of groups of related proteins of yet unknown structural information, as proposed in SUPFAM, should help in identifying 'priority proteins' for structure determination in structural genomics initiatives to expand the coverage of structural information in the protein sequence space. For example, we could assign 858 distinct Pfam domains in 2203 of the gene products in the genome of Mycobacterium tuberculosis. Fifty-one of these Pfam families of unknown structure could be clustered into 17 potentially new superfamilies forming good targets for structural genomics. The SUPFAM database can be accessed at http://pauling.mbu.iisc.ernet.in/~supfam.


Subjects
Databases, Protein , Genome , Protein Structure, Tertiary , Proteins/chemistry , Proteins/genetics , Animals , Forecasting , Genome, Bacterial , Imaging, Three-Dimensional , Information Storage and Retrieval , Internet , Mycobacterium tuberculosis/genetics , Proteins/physiology , Sequence Alignment , Sequence Homology, Amino Acid , Structure-Activity Relationship
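
The abstract above describes grouping Pfam families into potential superfamilies from an all-against-all profile comparison, but it does not state the grouping rule. The sketch below shows one plausible reading, assumed here for illustration: single-linkage clustering, where two families fall into the same group whenever a chain of sufficiently significant matches connects them. The function name `cluster_families`, the family identifiers, and the e-value threshold are all hypothetical; no IMPALA output format or SUPFAM cut-off is implied.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_families(
    matches: Iterable[Tuple[str, str, float]], max_evalue: float
) -> List[List[str]]:
    """Single-linkage clustering of families from pairwise matches.

    Each match is (family_a, family_b, e_value). Matches with an e-value
    above `max_evalue` are ignored. Only groups with two or more families
    are returned, since a lone family is not a superfamily relationship.
    """
    parent: Dict[str, str] = {}

    def find(family: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(family, family)
        while parent[family] != family:
            parent[family] = parent[parent[family]]
            family = parent[family]
        return family

    for family_a, family_b, e_value in matches:
        if e_value <= max_evalue:
            parent[find(family_a)] = find(family_b)

    groups: Dict[str, List[str]] = defaultdict(list)
    for family in parent:
        groups[find(family)].append(family)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Hypothetical all-against-all results (identifiers and e-values are made up).
matches = [
    ("PF_A", "PF_B", 1e-10),
    ("PF_B", "PF_C", 1e-6),
    ("PF_D", "PF_E", 1e-8),
    ("PF_E", "PF_F", 0.5),  # too weak: PF_F stays unclustered
]
superfamilies = cluster_families(matches, max_evalue=1e-3)
print(superfamilies)
```

Single linkage is the most permissive choice, since one significant match is enough to merge two groups. A stricter rule, such as requiring reciprocal matches or a minimum alignment length, would produce fewer and smaller groups.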