1.
Bioinformatics ; 24(20): 2308-16, 2008 Oct 15.
Article in English | MEDLINE | ID: mdl-18723520

ABSTRACT

MOTIVATION: Accurate computational prediction of protein functional sites is critical to maximizing the utility of recent high-throughput sequencing efforts. Among the available approaches, position-specific conservation scores remain among the most popular because of their accuracy and ease of computation. Unfortunately, high false positive rates remain a limiting factor. Using phylogenetic motifs (PMs), we have developed two combined (conservation + PMs) prediction schemes that significantly improve prediction accuracy. RESULTS: Our first approach, called position-specific MINER (psMINER), rank-orders alignment columns by conservation. Subsequently, positions that are not also identified as PMs are excluded from the prediction set. This approach yields a statistically significant improvement in prediction accuracy over the underlying conservation scores. Increased accuracy is a general result: improvement is observed over several different conservation scores that span a continuum of complexity. In addition, a hybrid MINER (hMINER) that quantitatively considers both scoring regimes provides further improvement. More importantly, it provides critical insight into the relative importance of phylogeny versus alignment conservation. Both methods outperform other common prediction algorithms that also utilize phylogenetic concepts. Finally, we demonstrate that the presented results are critically sensitive to functional site definition, highlighting the need for more complete benchmarks within the prediction community.
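
The psMINER filtering step described in this abstract can be illustrated with a minimal sketch. It assumes that conservation scores are given as one number per alignment column (higher meaning more conserved), that the PM positions are already known as a set of column indices, and that a fixed cutoff `top_k` defines the initial prediction set. The function name, the cutoff, and the data layout are hypothetical; the abstract does not specify them.

```python
from typing import List, Sequence, Set


def psminer_filter(conservation: Sequence[float],
                   pm_positions: Set[int],
                   top_k: int) -> List[int]:
    """Illustrative psMINER-style filter (hypothetical interface).

    1. Rank-order alignment columns by conservation, highest first.
    2. Take the top_k columns as the initial prediction set (assumed cutoff).
    3. Exclude columns that are not also identified as phylogenetic motifs.
    """
    # sorted() is stable, so tied columns keep their original index order.
    ranked = sorted(range(len(conservation)),
                    key=lambda i: conservation[i],
                    reverse=True)
    return [i for i in ranked[:top_k] if i in pm_positions]
```

For example, with scores `[0.9, 0.2, 0.8, 0.95, 0.1]`, PM columns `{0, 3, 4}`, and `top_k=3`, the three most conserved columns are 3, 0, and 2, and column 2 is dropped because it is not a PM.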


Subject(s)
Phylogeny , Proteins/chemistry , Algorithms , Amino Acid Motifs , Binding Sites , Computational Biology/methods , Conserved Sequence , Databases, Protein , Protein Conformation , Protein Structure, Tertiary , Proteins/classification , Proteins/genetics , Sequence Alignment/methods , Sequence Analysis, Protein
2.
J Bioinform Comput Biol ; 4(1): 19-42, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16568540

ABSTRACT

With the advent of experimental technologies like chemical cross-linking, it has become possible to obtain distances between specific residues of a newly sequenced protein. Such experiments are usually less time-consuming than X-ray crystallography or NMR. Consequently, it is highly desirable to develop a method that incorporates this distance information to improve the performance of protein threading methods. However, protein threading with profiles in which constraints on distances between residues are given is known to be NP-hard. Using a maximum edge-weight clique-finding algorithm, we introduce a more efficient method called FTHREAD for profile threading with distance constraints that is 18 times faster than its predecessor CLIQUETHREAD. Moreover, we also present a novel practical algorithm, NTHREAD, for profile threading with non-strict constraints. The overall performance of FTHREAD on a data set shows that, although it uses a simple threading function, it performs as well as some of the existing methods. In particular, when there are some unsatisfied constraints, NTHREAD (non-strict constraints threading algorithm) performs better than FTHREAD (strict constraints threading algorithm). We have also analyzed the effects of using a number of distance constraints. This algorithm helps enhance the quality of the alignment between the query sequence and the template structure once the corresponding template structure has been determined for the target sequence.


Subject(s)
Algorithms , Proteins/chemistry , Amino Acid Sequence , Computational Biology , Cross-Linking Reagents , Molecular Structure , Proteins/genetics
3.
Genome Inform ; 17(1): 3-12, 2006.
Article in English | MEDLINE | ID: mdl-17503351

ABSTRACT

In this paper, we present several methods for computing a solution to the protein side chain packing problem. All of the methods share a common approach: breaking the polymer into subpolymers and using maximum edge weight cliques to prune the search space for the optimal side chain packing. We characterize the graph sizes generated for each method and compare their prediction accuracies. These methods are demonstrated on proteins of up to approximately 8000 residues. In addition, we update a result published previously.


Subject(s)
Computational Biology/methods , Oligopeptides/chemistry , Proteins/chemistry , Algorithms , Amino Acids/chemistry , Models, Chemical , Models, Molecular , Predictive Value of Tests , Protein Conformation
4.
J Bioinform Comput Biol ; 3(1): 103-26, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15751115

ABSTRACT

Protein side-chain packing has ever-increasing applications in bioinformatics, ranging from the early methods of homology modeling to protein design and protein docking. However, this problem is known to be NP-hard. We have therefore developed a novel approach to solving it using the notion of a maximum edge-weight clique. Our approach is based on an efficient reduction of the protein side-chain packing problem to a graph, followed by finding the maximum clique of the reduced graph with an efficient clique-finding algorithm developed by our co-authors. Since our approach is based on deterministic algorithms, in contrast to the various existing algorithms based on heuristic approaches, our algorithm is guaranteed to find an optimal solution. We have tested this approach by predicting the side-chain conformations of a set of proteins and have compared the results with other existing methods. We have found that our results are comparable to or better than those produced by the existing methods. As our test set contains a protein of 494 residues, we have obtained considerable improvement in terms of the size of the proteins and in terms of the efficiency and accuracy of prediction.
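
The clique formulation mentioned in this abstract can be sketched on a toy instance. The sketch assumes one common way of setting up such a reduction, which the abstract does not spell out: each vertex is a (residue, rotamer) pair, edges join vertices of different residues, and each edge weight is the negated pairwise energy, so a clique with one vertex per residue and maximum total edge weight corresponds to a minimum-energy packing. The function `best_packing`, the `pair_energy` callback, and the exhaustive enumeration are illustrative assumptions. This is a brute-force check of the formulation, not the authors' clique-finding algorithm.

```python
from itertools import combinations, product
from typing import Callable, Hashable, List, Sequence, Tuple


def best_packing(
    rotamers: Sequence[Sequence[Hashable]],
    pair_energy: Callable[[int, Hashable, int, Hashable], float],
) -> Tuple[Tuple[Hashable, ...], float]:
    """Exhaustively find the maximum edge-weight clique of a toy packing graph.

    rotamers[i] lists the candidate rotamers of residue i.
    pair_energy(i, a, j, b) is the interaction energy (lower is better)
    between rotamer a at residue i and rotamer b at residue j, with i < j.
    Returns the chosen rotamer per residue and the clique weight.
    """
    residue_pairs = list(combinations(range(len(rotamers)), 2))
    best_choice: Tuple[Hashable, ...] = ()
    best_weight = float("-inf")
    # Every choice of one rotamer per residue is a clique in the assumed graph.
    for choice in product(*rotamers):
        # Edge weight = negated energy, so maximum weight = minimum energy.
        weight = -sum(pair_energy(i, choice[i], j, choice[j])
                      for i, j in residue_pairs)
        if weight > best_weight:
            best_choice, best_weight = choice, weight
    return best_choice, best_weight
```

The enumeration grows exponentially with the number of residues, which is consistent with the NP-hardness noted in the abstract; it is only meant to make the objective concrete on instances of a few residues.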


Subject(s)
Algorithms , Crystallography/methods , Models, Chemical , Models, Molecular , Proteins/analysis , Proteins/chemistry , Sequence Analysis, Protein/methods , Computer Simulation , Likelihood Functions , Protein Conformation , Protein Structure, Secondary , Sequence Alignment/methods , Sequence Homology, Amino Acid
5.
Genome Inform ; 13: 143-52, 2002.
Article in English | MEDLINE | ID: mdl-14571383

ABSTRACT

We developed maximum clique-based algorithms for spot matching in two-dimensional gel electrophoresis images, protein structure alignment, and protein side-chain packing, all of which are known to be NP-hard. Algorithms based on direct reductions to the maximum clique problem can find optimal solutions for instances of size (the number of points or residues) up to 50-150 using a standard PC. We also developed pre-processing techniques to reduce the sizes of the graphs. Combined with some heuristics, many realistic instances can be solved approximately.
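
To make the "direct reduction to the maximum clique" idea concrete, here is a minimal exact maximum clique search. It uses the standard Bron-Kerbosch procedure with pivoting plus a simple size bound, swapped in for illustration; the abstract does not say which clique algorithm the authors used. The adjacency-dictionary input format and the function name are assumptions.

```python
from typing import Dict, Hashable, List, Set


def max_clique(adj: Dict[Hashable, Set[Hashable]]) -> List[Hashable]:
    """Return one maximum clique of an undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Bron-Kerbosch with pivoting; branches that cannot beat the best
    clique found so far are pruned.
    """
    best: List[Hashable] = []

    def expand(r: List[Hashable], p: Set[Hashable], x: Set[Hashable]) -> None:
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # bound: even taking all candidates cannot improve
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best
```

On a graph with edges 1-2, 1-3, 2-3, 3-4, and 4-5, the search returns the triangle formed by vertices 1, 2, and 3. Worst-case running time is exponential, in line with the instance-size limits the abstract reports for exact solutions.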


Subject(s)
Algorithms , Data Interpretation, Statistical , Protein Structure, Tertiary , Sequence Analysis, Protein/methods , Animals , Computational Biology/methods , Electrophoresis, Gel, Two-Dimensional/methods , Humans , Sequence Alignment/methods