1.
J Bioinform Comput Biol ; 6(2): 335-45, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18464326

ABSTRACT

Measuring the accuracy of protein three-dimensional structures is one of the most important problems in protein structure prediction. For structure-based drug design, the accuracy of the binding site is far more important than the accuracy of any other region of the protein. We have developed an automated method for assessing the quality of a protein model by focusing on the set of residues in the small molecule binding site. Small molecule binding sites typically involve multiple regions of the protein coming together in space, and their accuracy has been observed to be sensitive to even small alignment errors. In addition, ligand binding sites contain the critical information required for drug design, making their accuracy particularly important. We analyzed the accuracy of the binding sites on two sets of protein models: the predictions submitted by the top-performing CASP7 groups, and the models generated by four widely used homology modeling packages. The results of our CASP7 analysis significantly differ from the previous findings, implying that the binding site measure does not correlate with the traditional model quality measures used in the structure prediction benchmarks. For the modeling programs, the resolution of binding sites is extremely sensitive to the degree of sequence homology between the query and the template, even when the most accurate alignments are used in the homology modeling process.
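
The contrast between a binding-site measure and a whole-model measure can be illustrated with a minimal sketch. The abstract does not state which metric the authors used, so this assumes a plain C-alpha RMSD over structures that are already superposed; the coordinates, residue numbers, and the `rmsd` helper are hypothetical.

```python
import math

def rmsd(model, reference, residues):
    """RMSD over the given residue ids; coordinates are assumed already superposed."""
    squared = 0.0
    for r in residues:
        squared += sum((a - b) ** 2 for a, b in zip(model[r], reference[r]))
    return math.sqrt(squared / len(residues))

# Hypothetical C-alpha coordinates in angstroms, keyed by residue number.
reference = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
             3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
model = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
         3: (7.6, 2.0, 0.0), 4: (11.4, 2.0, 0.0)}

# Assumed binding-site residues (illustrative only).
binding_site = [3, 4]

global_rmsd = rmsd(model, reference, list(reference))
site_rmsd = rmsd(model, reference, binding_site)
print(f"global RMSD: {global_rmsd:.3f}  binding-site RMSD: {site_rmsd:.3f}")
```

In this toy case the error is concentrated in the binding-site residues, so the site-restricted value is larger than the global one, which is the kind of divergence between measures the abstract reports.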


Subject(s)
Protein Conformation , Proteins/chemistry , Animals , Computational Biology , Databases, Protein , Humans , Models, Molecular
2.
Proteins ; 65(4): 953-8, 2006 Dec 01.
Article in English | MEDLINE | ID: mdl-17006949

ABSTRACT

We present a novel, knowledge-based method for the side-chain addition step in protein structure modeling. The foundation of the method is a conditional probability equation, which specifies the probability that a side-chain will occupy a specific rotamer state, given a set of evidence about the rotamer states adopted by the side-chains at aligned positions in structurally homologous crystal structures. We demonstrate that our method increases the accuracy of homology model side-chain addition when compared with the widely employed practice of preserving the side-chain conformation from the homology template to the target at conserved residue positions. Furthermore, we demonstrate that our method accurately estimates the probability that the correct rotamer state has been selected. This interesting result implies that our method can be used to understand the reliability of each and every side-chain in a protein homology model.
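
The idea of assigning a probability to each rotamer state from the states seen in homologous structures can be sketched as follows. This is not the authors' conditional probability equation, which the abstract does not give; it is a simple smoothed frequency estimate used as a stand-in, and the state labels, the `rotamer_posterior` function, and the pseudocount are all assumptions.

```python
from collections import Counter

def rotamer_posterior(observed, states, pseudocount=1.0):
    """Smoothed frequency estimate of P(state | rotamers observed in homologs).

    A stand-in for illustration, not the published equation.
    """
    counts = Counter(observed)
    total = len(observed) + pseudocount * len(states)
    return {s: (counts[s] + pseudocount) / total for s in states}

# Hypothetical chi1 rotamer states and evidence from four aligned homologs.
states = ["g+", "t", "g-"]
observed = ["t", "t", "g-", "t"]

posterior = rotamer_posterior(observed, states)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

The probability attached to the selected state plays the role of the per-side-chain reliability estimate mentioned in the abstract.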


Subject(s)
Models, Molecular , Proteins/chemistry , Sequence Alignment/methods , Structural Homology, Protein , Amino Acid Sequence , Computer Simulation , Databases, Protein , Protein Conformation , Sequence Homology, Amino Acid
3.
J Chem Inf Model ; 46(4): 1871-6, 2006.
Article in English | MEDLINE | ID: mdl-16859318

ABSTRACT

Advances in protein crystallography and homology modeling techniques are producing vast amounts of high resolution protein structure data at ever increasing rates. As such, the ability to quickly and easily extract structural similarities is a key tool in discovering important functional relationships. We report on an approach for creating and maintaining a database of pairwise structure alignments for a comprehensive database comprising the PDB and homology models for the human and select pathogen genomes. Our approach consists of a novel, multistage method for determining pairwise structural similarity coupled with an efficient clustering protocol that approximates a full NxN assessment in a fraction of the time. Since biologists are commonly interested in recently released structures, and the homology models built from them, an automatically updating database of structural alignments has great value. Our approach yields a querying system that allows scientists to retrieve databank-wide protein structure similarities as easily as retrieving protein sequence similarities via BLAST or PSI-BLAST. Basic, noncommercial access to the database can be requested at https://tip.eidogen-sertanty.com/.


Subject(s)
Databases, Protein , Protein Conformation , Models, Chemical
4.
Proteins ; 64(4): 960-7, 2006 Sep 01.
Article in English | MEDLINE | ID: mdl-16786595

ABSTRACT

STRUCTFAST is a novel profile-profile alignment algorithm capable of detecting weak similarities between protein sequences. The increased sensitivity and accuracy of the STRUCTFAST method are achieved through several unique features. First, the algorithm utilizes a novel dynamic programming engine capable of incorporating important information from a structural family directly into the alignment process. Second, the algorithm employs a rigorous analytical formula for profile-profile scoring to overcome the limitations of ad hoc scoring functions that require adjustable parameter training. Third, the algorithm employs Convergent Island Statistics (CIS) to compute the statistical significance of alignment scores independently for each pair of sequences. STRUCTFAST routinely produces alignments that meet or exceed the quality obtained by an expert human homology modeler, as evidenced by its performance in the latest CAFASP4 and CASP6 blind prediction benchmark experiments.


Subject(s)
Proteins/chemistry , Sequence Alignment/methods , Sequence Homology, Amino Acid , Algorithms , Software
5.
Bioinformatics ; 21(12): 2827-31, 2005 Jun 15.
Article in English | MEDLINE | ID: mdl-15817690

ABSTRACT

MOTIVATION: Background distribution statistics for profile-based sequence alignment algorithms cannot be calculated analytically, and hence such algorithms must resort to measuring the significance of an alignment score by assessing its location among a distribution of background alignment scores. The Gumbel parameters that describe this background distribution are usually pre-computed for a limited number of scoring systems, gap schemes, and sequence lengths and compositions. The use of such look-ups is known to introduce errors, which compromise the significance assessment of a remote homology relationship. One solution is to estimate the background distribution for each pair of interest by generating a large number of sequence shuffles and use the distribution of their scores to approximate the parameters of the underlying extreme value distribution. This is computationally very expensive, as a large number of shuffles are needed to precisely estimate the score statistics.

RESULTS: Convergent Island Statistics (CIS) is a computationally efficient solution to the problem of calculating the Gumbel distribution parameters for an arbitrary pair of sequences and an arbitrary set of gap and scoring schemes. The basic idea behind our method is to recognize the lack of similarity for any pair of sequences early in the shuffling process and thus save on the search time. The method is particularly useful in the context of profile-profile alignment algorithms where the normalization of alignment scores has traditionally been a challenging task.

CONTACT: aleksandar@eidogen.com

SUPPLEMENTARY INFORMATION: http://www.eidogen-sertanty.com/Documents/convergent_island_stats_sup.pdf.
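
The brute-force shuffling baseline described under MOTIVATION can be sketched as below. This is not CIS itself: it is the expensive per-pair estimate that CIS is said to accelerate. The sketch assumes plain sequences rather than profiles, a Smith-Waterman score with a linear gap penalty, and a method-of-moments Gumbel fit; the scoring values, the example sequences, and all function names are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

def gumbel_from_shuffles(a, b, n_shuffles=200, seed=0):
    """Estimate Gumbel location (mu) and scale (beta) from shuffled-sequence scores."""
    rng = random.Random(seed)
    letters = list(b)
    scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # destroys order, keeps length and composition
        scores.append(local_score(a, "".join(letters)))
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    beta = sd * math.sqrt(6) / math.pi  # method of moments
    mu = mean - EULER_GAMMA * beta
    return mu, beta

def p_value(score, mu, beta):
    """P(S >= score) under the fitted Gumbel distribution."""
    return 1.0 - math.exp(-math.exp(-(score - mu) / beta))

# Hypothetical example sequences.
seq_a = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
seq_b = "MKTAYIAKQRNISFVKAHFSRQLDERLGLIEVQ"

mu, beta = gumbel_from_shuffles(seq_a, seq_b)
observed = local_score(seq_a, seq_b)
print(observed, p_value(observed, mu, beta))
```

Every shuffle costs a full dynamic-programming pass, which is why the abstract calls this approach computationally very expensive when many shuffles are needed per pair.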


Subject(s)
Algorithms , Models, Chemical , Models, Statistical , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Computer Simulation , Models, Molecular , Molecular Sequence Data , Sequence Homology, Amino Acid
6.
Proc Natl Acad Sci U S A ; 99(10): 6579-84, 2002 May 14.
Article in English | MEDLINE | ID: mdl-12011422

ABSTRACT

Although incorporation of amino acid analogs provides a powerful means of producing new protein structures with interesting functions, many amino acid analogs cannot be incorporated easily by using the wild-type aminoacyl-tRNA synthetase (aaRS). To be able to incorporate specific amino acid analogs site-specifically, it is useful to build a mutant aaRS that preferentially activates the analog compared with the natural amino acids. Experimental combinatorial studies to find such mutant aaRSs have been successful but can easily become costly and time-consuming. In this article, we describe the clash opportunity progressive (COP) computational method for designing a mutant aaRS to preferentially take up the analog compared with the natural amino acids. To illustrate this COP procedure, we apply it to the design of mutant Methanococcus jannaschii tyrosyl-tRNA synthetase (M.jann-TyrRS). Because the three-dimensional structure for M.jann-TyrRS was not available, we used the STRUCTFAST homology modeling procedure plus molecular dynamics with continuum solvent forces to predict the structure of wild-type M.jann-TyrRS. We validate this structure by predicting the binding site for tyrosine and calculating the binding energies of the 20 natural amino acids, which shows that tyrosine binds the strongest. With the COP design algorithm we then designed a mutant tyrosyl tRNA synthetase to activate O-methyl-L-tyrosine preferentially compared with L-tyrosine. This mutant [Y32Q, D158A] is similar to the mutant designed with combinatorial experiments, [Y32Q, D158A, E107T, L162P], by Wang et al. [Wang, L., Brock, A., Herberich, B. & Schultz, P. G. (2001) Science 292, 498-500]. We predict that the new one will have much greater activity while retaining significant discrimination between O-methyl-L-tyrosine and tyrosine.


Subject(s)
Methanococcus/enzymology , Methyltyrosines/chemistry , Tyrosine-tRNA Ligase/chemistry , Amino Acids , Crystallography, X-Ray , Models, Molecular , Mutagenesis , Protein Structure, Tertiary , Tyrosine-tRNA Ligase/genetics