1.
Appl Microbiol Biotechnol ; 95(6): 1479-89, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22218769

ABSTRACT

To expand the available set of Baeyer-Villiger monooxygenases (BVMOs), we have created expression constructs for producing 22 Type I BVMOs that are present in the genome of Rhodococcus jostii RHA1. Each BVMO has been probed with a large panel of potential substrates. In addition to substrate acceptance, the enantioselectivity of selected BVMOs was studied. The results provide insight into the biocatalytic potential of this collection of BVMOs and expand the biocatalytic repertoire known for BVMOs. This study also sheds light on the catalytic capacity of the large set of BVMOs present in this specific actinomycete. Furthermore, a comparative sequence analysis revealed a new BVMO-typifying sequence motif. This motif represents a useful tool for effective future genome mining efforts.


Subject(s)
Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Cloning, Molecular , Mixed Function Oxygenases/chemistry , Mixed Function Oxygenases/genetics , Rhodococcus/enzymology , Amino Acid Motifs , Amino Acid Sequence , Bacterial Proteins/metabolism , Gene Expression , Kinetics , Mixed Function Oxygenases/metabolism , Molecular Sequence Data , Phylogeny , Rhodococcus/chemistry , Rhodococcus/classification , Rhodococcus/genetics , Sequence Homology, Amino Acid , Substrate Specificity
2.
Protein Eng ; 14(10): 717-21, 2001 Oct.
Article in English | MEDLINE | ID: mdl-11739889

ABSTRACT

Using a recent version of the SICHO algorithm for in silico protein folding, we made a blind prediction of the tertiary structure of the N-terminal, independently folded, catalytic domain (CD) of the I-TevI homing endonuclease, a representative of the GIY-YIG superfamily of homing endonucleases. The secondary structure of the I-TevI CD has been determined using NMR spectroscopy, but computational sequence analysis failed to detect any protein of known tertiary structure related to the GIY-YIG nucleases (Kowalski et al., Nucleic Acids Res., 1999, 27, 2115-2125). To provide further insight into the structure-function relationships of all GIY-YIG superfamily members, including the recently described subfamily of type II restriction enzymes (Bujnicki et al., Trends Biochem. Sci., 2001, 26, 9-11), we incorporated the experimentally determined and predicted secondary and tertiary restraints in a reduced (side chain only) protein model, which was minimized by Monte Carlo dynamics and simulated annealing. The subsequently elaborated full atomic model of the I-TevI CD allows the available experimental data to be put into a structural context and suggests that the GIY-YIG domain may dimerize in order to bring together the conserved residues of the active site.


Subject(s)
Endodeoxyribonucleases/chemistry , Models, Molecular , Algorithms , Binding Sites , Monte Carlo Method , Nuclear Magnetic Resonance, Biomolecular , Protein Structure, Tertiary , Sequence Alignment
3.
Bioinformatics ; 17(12): 1240-1, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11751239

ABSTRACT

UNLABELLED: The ToolShop server makes it possible to compare a protein tertiary structure prediction server with other popular servers before releasing it to the public. The comparison is conducted on a set of 203 proteins, and the collected models are compared with those of over 20 other programs using various assessment procedures. The evaluation takes about one week. AVAILABILITY: The ToolShop server is available at http://BioInfo.PL/ToolShop/. The administrator should be contacted to couple the tested server to the evaluation suite. CONTACT: leszek@bioinfo.pl SUPPLEMENTARY INFORMATION: The evaluation procedures are similar to those implemented in the continuous online server evaluation program, LiveBench. Additional information is available from its homepage (http://BioInfo.PL/LiveBench/).


Subject(s)
Protein Conformation , Proteins/analysis , Software , Sensitivity and Specificity
4.
Protein Sci ; 10(11): 2354-62, 2001 Nov.
Article in English | MEDLINE | ID: mdl-11604541

ABSTRACT

In recent years many protein fold recognition methods have been developed, based on different algorithms and using various kinds of information. To examine the performance of these methods several evaluation experiments have been conducted. These include blind tests in CASP/CAFASP, large scale benchmarks, and long-term, continuous assessment with newly solved protein structures. These studies confirm the expectation that for different targets different methods produce the best predictions, and the final prediction accuracy could be improved if the available methods were combined in a perfect manner. In this article a neural-network-based consensus predictor, Pcons, is presented that attempts this task. Pcons attempts to select the best model out of those produced by six prediction servers, each using different methods. Pcons translates the confidence scores reported by each server into uniformly scaled values corresponding to the expected accuracy of each model. The translated scores as well as the similarity between models produced by different servers are used in the final selection. According to the analysis based on two unrelated sets of newly solved proteins, Pcons outperforms any single server by generating approximately 8%-10% more correct predictions. Furthermore, the specificity of Pcons is significantly higher than for any individual server. From analyzing different input data to Pcons it can be shown that the improvement is mainly attributable to measurement of the similarity between the different models. Pcons is freely accessible for the academic community through the protein structure-prediction metaserver at http://bioinfo.pl/meta/.


Subject(s)
Neural Networks, Computer , Proteins/chemistry , Algorithms , Models, Statistical , Protein Folding , Sensitivity and Specificity
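Illustrative note on entry 4: the abstract attributes most of the Pcons improvement to measuring how similar the models from different servers are to one another. A minimal sketch of that single idea follows, written in Python. It is not the Pcons implementation: it leaves out the neural network and the translated confidence scores, it represents each model as a set of aligned residue pairs, and it uses Jaccard overlap as the similarity measure. The server names, the data, and the function names are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of aligned (query, template) residue pairs."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pick_by_consensus(models, similarity=jaccard):
    """Return (server, score) for the model most similar, on average,
    to the models produced by all the other servers."""
    best_server, best_score = None, float("-inf")
    for server, model in models.items():
        others = [m for name, m in models.items() if name != server]
        # Average similarity to every other server's model.
        score = (
            sum(similarity(model, other) for other in others) / len(others)
            if others
            else 0.0
        )
        if score > best_score:
            best_server, best_score = server, score
    return best_server, best_score


# Hypothetical toy input: three servers roughly agree, one is an outlier.
toy_models = {
    "server_a": {(1, 5), (2, 6), (3, 7)},
    "server_b": {(1, 5), (2, 6), (3, 7), (4, 8)},
    "server_c": {(9, 20), (10, 21)},
    "server_d": {(2, 6), (3, 7), (4, 8)},
}
chosen_server, chosen_score = pick_by_consensus(toy_models)
print(chosen_server, round(chosen_score, 3))
```

In this toy input the outlier (server_c) scores zero, and the model that shares the most aligned pairs with the others is selected. A real consensus predictor would compare three-dimensional models rather than alignment sets.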
5.
Genome Biol ; 2(9): RESEARCH0038, 2001.
Article in English | MEDLINE | ID: mdl-11574057

ABSTRACT

BACKGROUND: The reovirus lambda2 protein catalyzes mRNA capping, that is, addition of a guanosine to the 5' end of each transcript in a 5'-to-5' orientation, as well as transfer of a methyl group from S-adenosyl-L-methionine (AdoMet) to the N7 atom of the added guanosyl moiety and subsequently to the ribose 2'-O atom of the first template-encoded nucleotide. The structure of the human reovirus core has been solved at 3.6 Å resolution, revealing a series of domains that include a putative guanylyltransferase domain and two putative methyltransferase (MTase) domains. It has been suggested that the order of domains in the lambda2 protein corresponds to the order of reactions in the pathway and that the m7G (cap 0) and the 2'-O-ribose (cap 1) MTase activities may be exerted by the MTase 1 and the MTase 2 domains, respectively. RESULTS: We show that the reovirus MTase 1 domain shares a putative active site with the structurally characterized 2'-O-ribose MTases, including vaccinia virus cap 1 MTase, whereas the MTase 2 domain is structurally similar to glycine N-MTase. CONCLUSIONS: On the basis of our analysis of the structural details we propose that the previously suggested functional assignments of the MTase 1 and MTase 2 domains should be swapped.


Subject(s)
Catalytic Domain , Methyltransferases/chemistry , Methyltransferases/metabolism , RNA Caps/metabolism , Reoviridae/enzymology , Viral Core Proteins/metabolism , Animals , Carps , Humans , Nucleotidyltransferases/chemistry , Nucleotidyltransferases/metabolism , RNA Caps/chemistry , Substrate Specificity , Viral Core Proteins/chemistry
6.
BMC Bioinformatics ; 2: 5, 2001.
Article in English | MEDLINE | ID: mdl-11545673

ABSTRACT

BACKGROUND: Prediction of protein structures is one of the fundamental challenges in biology today. To fully understand how well different prediction methods perform, it is necessary to use measures that evaluate their performance. Every two years, starting in 1994, the CASP (Critical Assessment of protein Structure Prediction) process has been organized to evaluate the ability of different predictors to blindly predict the structure of proteins. To capture different features of the models, several measures have been developed during the CASP processes. However, these measures have not been examined in detail before. In an attempt to develop fully automatic measures that can be used in CASP, as well as in other types of benchmarking experiments, we have compared twenty-one measures. These include the measures used in CASP3 and CASP2 as well as measures introduced later. We have studied their ability to distinguish between the better and worse models submitted to CASP3 and the correlation between them. RESULTS: Using a small set of 1340 models for 23 different targets we show that most measures correlate with each other. Most pairs of measures show a correlation coefficient of about 0.5. The correlation is slightly higher for measures of similar types. We found that a significant problem when developing automatic measures is how to deal with proteins of different length. Also, the comparison between different measures is complicated, as many measures are dependent on the size of the target. We show that the manual assessment can be reproduced to about 70% using automatic measures. Alignment-independent measures detect slightly more of the models with the correct fold, while alignment-dependent measures agree better when selecting the best models for each target. Finally, we show that using automatic measures would, to a large extent, reproduce the assessors' ranking of the predictors at CASP3. CONCLUSIONS: We show that given a sufficient number of targets the manual and automatic measures would have given almost identical results at CASP3. If the intent is to reproduce the type of scoring done by the manual assessor in CASP3, the best approach might be to use a combination of alignment-independent and alignment-dependent measures, as used in several recent studies.


Subject(s)
Computational Biology/standards , Models, Molecular , Computational Biology/methods , Computational Biology/statistics & numerical data , Predictive Value of Tests , Protein Conformation , Protein Structure, Tertiary/genetics
7.
J Struct Biol ; 134(2-3): 219-31, 2001.
Article in English | MEDLINE | ID: mdl-11551181

ABSTRACT

Fold assignments for newly sequenced genomes are among the most important and interesting applications of the booming field of protein structure prediction. We present a brief survey and a discussion of such assignments completed to date, using as an example several fold assignment projects for proteins from the Escherichia coli genome. This review focuses on steps that are necessary to go beyond the simple assignment projects and into the development of tools extending our understanding of functions of proteins in newly sequenced genomes. This paper also discusses several problems seldom addressed in the literature, such as the problem of domain prediction and complementary predictions (e.g., transmembrane regions and flexible regions) and cross-correlation of predictions from different servers. The influence of sequence and structure database growth on prediction success is also addressed. Finally, we discuss the perspectives of the field in the context of massive sequence and structure determination projects, as well as the development of novel prediction methods.


Subject(s)
Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Escherichia coli/chemistry , Escherichia coli/genetics , Genome, Bacterial , Protein Folding , Algorithms
8.
Acta Microbiol Pol ; 50(1): 7-17, 2001.
Article in English | MEDLINE | ID: mdl-11518396

ABSTRACT

Methylation of G1405 within bacterial 16S ribosomal RNA results in high-level resistance to specific combinations of aminoglycoside antibiotics. Only a few closely related methyltransferases (MTases), which carry out the respective modification (here dubbed "Agr", for aminoglycoside resistance), are known. It is not clear whether they are related to "typical" S-adenosylmethionine (AdoMet)-dependent MTases or not. Demydchuk et al. (1998) proposed that the cofactor-binding region is localized at the C-terminus of Agr MTases, which implies an interesting case of sequence permutation. Since the Agr MTases lack significant sequence similarity to other proteins, we tested that hypothesis using a more sensitive sequence/structure threading approach. Structure prediction confirmed the presence of a putative AdoMet-binding site in these proteins, albeit at a distinct location, resembling that of "typical", non-permuted MTases. Additionally, a small alpha-helical domain dissimilar to other proteins in the database was identified in the N-terminal region of Agr MTases. Comparison of a three-dimensional model of the Agr family member with a recently solved structure of reovirus mRNA capping MTase suggests that the mechanism of guanine-N7 methylation in rRNA and mRNA may be different.


Subject(s)
Anti-Bacterial Agents/pharmacology , Drug Resistance, Microbial/genetics , Methyltransferases/chemistry , Actinomycetales/enzymology , Amino Acid Sequence , Aminoglycosides , Forecasting , Methyltransferases/genetics , Micromonospora/enzymology , Models, Molecular , Molecular Sequence Data , Sequence Analysis, Protein , Sequence Homology, Amino Acid
9.
Bioinformatics ; 17(8): 750-1, 2001 Aug.
Article in English | MEDLINE | ID: mdl-11524381

ABSTRACT

UNLABELLED: The Structure Prediction Meta Server offers a convenient way for biologists to utilize various high quality structure prediction servers available worldwide. The meta server translates the results obtained from remote services into a uniform format; the translated results are subsequently used to request a jury prediction from the remote consensus server Pcons. AVAILABILITY: The structure prediction meta server is freely available at http://BioInfo.PL/meta/; some remote servers, however, have restrictions for non-academic users, which are respected by the meta server. SUPPLEMENTARY INFORMATION: Results of several sessions of the CAFASP and LiveBench programs for assessment of performance of fold-recognition servers carried out via the meta server are available at http://BioInfo.PL/services.html.


Subject(s)
Databases, Protein , Proteins/chemistry , Amino Acid Sequence , Computational Biology , Protein Folding , Protein Structure, Secondary , Protein Structure, Tertiary , Proteins/genetics , Software
10.
BMC Bioinformatics ; 2: 2, 2001.
Article in English | MEDLINE | ID: mdl-11472630

ABSTRACT

BACKGROUND: The 5'-terminal cap structure plays an important role in many aspects of mRNA metabolism. Capping enzymes encoded by viruses and pathogenic fungi are attractive targets for specific inhibitors. There is a large body of experimental data on viral and cellular methyltransferases (MTases) that carry out guanine-N7 (cap 0) methylation, including results of extensive mutagenesis. However, a crystal structure is not available and cap 0 MTases are too diverged from other MTases of known structure to allow straightforward homology-based interpretation of these data. RESULTS: We report a 3D model of cap 0 MTase, developed using sequence-to-structure threading and comparative modeling based on coordinates of the glycine N-methyltransferase. Analysis of the predicted structural features in the phylogenetic context of the cap 0 MTase family allows us to rationalize most of the experimental data available and to propose potential binding sites. We identified a case of correlated mutations in the cofactor-binding site of viral MTases that may be important for rational drug design. Furthermore, database searches and phylogenetic analysis revealed a novel subfamily of hypothetical MTases from plants, distinct from "orthodox" cap 0 MTases. CONCLUSIONS: Computational methods were used to infer the evolutionary relationships and predict the structure of eukaryotic cap MTase. Identification of novel cap MTase homologs suggests candidates for cloning and biochemical characterization, while the structural model will be useful in designing new experiments to better understand the molecular function of cap MTases.


Subject(s)
Evolution, Molecular , Methyltransferases/genetics , Methyltransferases/isolation & purification , Multigene Family/genetics , RNA Caps , Amino Acid Motifs/genetics , Amino Acid Sequence , Animals , Binding Sites/genetics , Computational Biology/methods , Conserved Sequence/genetics , Databases, Protein , Giardia lamblia/genetics , Guanine/metabolism , Humans , Models, Molecular , Molecular Sequence Data , Mutagenesis, Site-Directed/genetics , Phylogeny , Protein Structure, Quaternary/genetics , Protozoan Proteins/genetics , Saccharomyces cerevisiae Proteins/genetics , Structure-Activity Relationship
11.
Protein Sci ; 10(3): 656-60, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11344334

ABSTRACT

The tRNA splicing endoribonuclease EndA from Methanococcus jannaschii is a homotetramer formed via heterologous interaction between the two pairs of homodimers. Each monomer consists of two alpha/beta domains, the N-terminal domain (NTD) and the C-terminal domain (CTD) containing the RNase A-like active site. Comparison of the EndA coordinates with the publicly available protein structure database revealed the similarity of both domains to site-specific deoxyribonucleases: the NTD to the LAGLIDADG family and the CTD to the PD-(D/E)XK family. Superposition of the NTD on the catalytic domain of LAGLIDADG homing endonucleases allowed a suggestion to be made about which amino acid residues of the tRNA splicing nuclease might participate in formation of a presumptive cryptic deoxyribonuclease active site. On the other hand, the CTD and PD-(D/E)XK endonucleases, represented by restriction enzymes and a phage lambda exonuclease, were shown to share extensive similarities of the structural framework, to which entirely different active sites might be attached in two alternative locations. These findings suggest that EndA evolved from a fusion protein with at least two distinct endonuclease activities: the ribonuclease, which made it an essential "antitoxin" for the cells whose RNA genes were interrupted by introns, and the deoxyribonuclease, which provided the means for homing-like mobility. The residues of the noncatalytic CTDs from the positions corresponding to the catalytic side chains in PD-(D/E)XK deoxyribonucleases map to the surface at the opposite side to the tRNA binding site, for which no function has been implicated. Many restriction enzymes from the PD-(D/E)XK superfamily might have the potential to maintain an additional active or binding site at the face opposite the deoxyribonuclease active site, a property that can be utilized in protein engineering.


Subject(s)
Deoxyribonucleases/chemistry , Endodeoxyribonucleases/chemistry , Endoribonucleases/chemistry , Evolution, Molecular , RNA, Transfer/chemistry , Ribonucleases/chemistry , Binding Sites , Catalytic Domain , DNA Restriction Enzymes , Databases, Factual , Deoxyribonucleases/metabolism , Endodeoxyribonucleases/metabolism , Endoribonucleases/metabolism , Introns/genetics , Methanococcus/enzymology , Protein Structure, Tertiary , RNA, Transfer/metabolism , Ribonucleases/metabolism
12.
Gene ; 267(2): 183-91, 2001 Apr 18.
Article in English | MEDLINE | ID: mdl-11313145

ABSTRACT

The Escherichia coli K-12 restriction enzyme Mrr recognizes and cleaves N6-methyladenine- and 5-methylcytosine-containing DNA. Its amino acid sequence has been subjected to structure prediction and comparison with other sequences from publicly available sources. The results obtained suggest that Mrr and related putative endonucleases possess a cleavage domain typical of all type II restriction enzymes structurally characterized so far, albeit with an unusual glutamine residue at the central position of the (D/E)-(D/E)XK hallmark of the active site. The "missing" acidic side chain was instead found anchored in a different, unusual position, suggesting that Mrr represents a third topological variant of the endonuclease active site in addition to the two alternatives determined previously (Skirgaila et al., 1998. J. Mol. Biol. 279, 473-481). One of the newly identified putative endonucleases from the Mrr family is composed of two diverged cleavage domains, which possess both the "typical" D-EXK and the "Mrr-like" D-QXK variants of the active site. Among the Mrr homologs there are also proteins from yeast, in which a restriction phenotype has not been observed, suggesting that the free-standing eukaryotic PD-(D/E)XK superfamily members might be implicated in other cellular processes involving enzymatic DNA cleavage.


Subject(s)
DNA Restriction Enzymes/genetics , Escherichia coli Proteins , Phylogeny , Amino Acid Sequence , Binding Sites , DNA Restriction Enzymes/chemistry , Models, Molecular , Molecular Sequence Data , Molecular Structure , Mutation , Protein Structure, Tertiary , Sequence Alignment , Sequence Homology, Amino Acid
13.
Virus Genes ; 22(2): 219-30, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11324759

ABSTRACT

The PD-(D/E)XK superfamily of deoxyribonucleases (ENases) comprises restriction endonucleases, exonucleases and nicking enzymes, which share a common fold and the architecture of the active site. Their extreme divergence generally hampers identification of novel members based solely on sequence comparisons. Here we report a remote similarity between the phage lambda exonuclease (lambda-exo), branching out early in the evolutionary history of ENases (3), and the family of alkaline exonucleases (AE) encoded by various viruses infecting higher Eukaryota. The predicted structural compatibility and the conservation of the functionally important residues between AE and ENases strongly suggest a distant evolutionary relationship between these proteins. According to the results of extensive sequence database mining, sequence/structure threading and molecular modeling, it is plausible that the AE proteins, together with lambda-exo and some other putative phage-encoded exonucleases, form a distinct subfamily of PD-(D/E)XK ENases. The phylogenetic history of this subfamily is inferred using sequence alignment and distance matrix methods.


Subject(s)
Exodeoxyribonucleases/genetics , Herpesviridae/enzymology , Amino Acid Sequence , Animals , Exodeoxyribonucleases/chemistry , Exodeoxyribonucleases/classification , Herpesviridae/genetics , Herpesvirus 1, Human/enzymology , Humans , Molecular Sequence Data , Phylogeny , Protein Structure, Secondary , Sequence Analysis , Structure-Activity Relationship
14.
Protein Sci ; 10(2): 352-61, 2001 Feb.
Article in English | MEDLINE | ID: mdl-11266621

ABSTRACT

We present a novel, continuous approach aimed at the large-scale assessment of the performance of available fold-recognition servers. Six popular servers were investigated: PDB-Blast, FFAS, T98-lib, GenTHREADER, 3D-PSSM, and INBGU. The assessment was conducted using as prediction targets a large number of selected protein structures released from October 1999 to April 2000. A target was selected if its sequence showed no significant similarity to any of the proteins previously available in the structural database. Overall, the servers were able to produce structurally similar models for one-half of the targets, but significantly accurate sequence-structure alignments were produced for only one-third of the targets. We further classified the targets into two sets: easy and hard. We found that all servers were able to find the correct answer for the vast majority of the easy targets if a structurally similar fold was present in the server's fold libraries. However, among the hard targets--where standard methods such as PSI-BLAST fail--the most sensitive fold-recognition servers were able to produce similar models for only 40% of the cases, half of which had a significantly accurate sequence-structure alignment. Among the hard targets, the presence of updated libraries appeared to be less critical for the ranking. An "ideally combined consensus" prediction, where the results of all servers are considered, would increase the percentage of correct assignments by 50%. Each server had a number of cases with a correct assignment, where the assignments of all the other servers were wrong. This emphasizes the benefits of considering more than one server in difficult prediction tasks. The LiveBench program (http://BioInfo.PL/LiveBench) is being continued, and all interested developers are cordially invited to join.


Subject(s)
Databases, Factual , Protein Folding , Software , Computer Simulation , Models, Molecular , Sensitivity and Specificity
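Illustrative note on entry 14: the abstract compares an "ideally combined consensus" with individual servers and notes that each server had correct assignments that all the others missed. The arithmetic behind both statements can be sketched as below, assuming each server's results are available as a set of correctly assigned target identifiers. The server names and target lists are invented for illustration and are not LiveBench data. The gain is measured here relative to the best single server, which is one possible reading of the abstract.

```python
def consensus_summary(correct_by_server):
    """Summarise how an ideal (oracle) combination compares with single servers.

    correct_by_server maps a server name to the set of targets it got right.
    """
    best_single = max(len(targets) for targets in correct_by_server.values())
    # An ideal consensus is right whenever at least one server is right.
    ideal = set().union(*correct_by_server.values())
    # Targets that exactly one server solved.
    unique = {}
    for server, targets in correct_by_server.items():
        rest = set().union(
            *(t for name, t in correct_by_server.items() if name != server)
        )
        unique[server] = targets - rest
    gain = (len(ideal) - best_single) / best_single if best_single else 0.0
    return {
        "ideal": len(ideal),
        "best_single": best_single,
        "gain": gain,
        "unique": unique,
    }


# Hypothetical toy results, chosen so the arithmetic is easy to check by hand.
toy_results = {
    "server_a": {"t1", "t2", "t3", "t4"},
    "server_b": {"t1", "t2", "t5"},
    "server_c": {"t2", "t6"},
}
summary = consensus_summary(toy_results)
print(summary["ideal"], summary["best_single"], summary["gain"])
```

Here the best single server solves four targets while the oracle combination solves six, a relative gain of 0.5, and every server contributes at least one target that no other server solved.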
15.
J Mol Microbiol Biotechnol ; 3(1): 69-72, 2001 Jan.
Article in English | MEDLINE | ID: mdl-11200231

ABSTRACT

The PD-(D/E)XK nuclease domains, initially identified in type II restriction enzymes, serve as models for studying aspects of protein-DNA interactions and mechanisms of phosphodiester hydrolysis, and provide indispensable tools for techniques in genetic engineering and molecular medicine. However, the low degree of amino acid conservation hampers the identification of PD-(D/E)XK superfamily members based solely on sequence comparisons. In several proteins implicated in DNA recombination and repair the restriction enzyme-like nuclease domain has been found only after the corresponding structures were determined experimentally. Here, we identified highly diverged variants of the PD-(D/E)XK domain in many proteins and open reading frames using iterative database searches and progressive, structure-guided alignment of sequence profiles. We predicted the possible cellular function for many hypothetical proteins based on their relative similarity to characterized nucleases or observed presence of additional domains. We also identified the nuclease domain in genuine recombinases and restriction enzymes, whose homology to other PD-(D/E)XK enzymes has not been demonstrated previously. The first superfamily-wide comparative analysis, not limited to nucleases of known structure, will guide cloning and characterization of novel enzymes and planning new experiments to better understand those already studied.


Subject(s)
Deoxyribonucleases, Type II Site-Specific/classification , Amino Acid Sequence , Animals , DNA Restriction Enzymes/chemistry , DNA Restriction Enzymes/classification , Deoxyribonucleases, Type II Site-Specific/chemistry , Molecular Sequence Data , Sequence Alignment
16.
Trends Biochem Sci ; 26(1): 9-11, 2001 Jan.
Article in English | MEDLINE | ID: mdl-11165501

ABSTRACT

Using algorithms for protein sequence analysis we predict that some of the canonical type II and type IIS restriction enzymes have an active site with a substantially different architecture and fold from the "typical" PD-(D/E)xK superfamily. These results suggest that they are related to nucleases from the HNH and GIY-YIG superfamilies.


Subject(s)
Bacterial Proteins , Deoxyribonucleases, Type II Site-Specific/chemistry , Evolution, Molecular , Protein Folding , Algorithms , Amino Acid Motifs , Amino Acid Sequence , Binding Sites , DNA (Cytosine-5-)-Methyltransferases/chemistry , DNA (Cytosine-5-)-Methyltransferases/metabolism , Deoxyribonucleases, Type II Site-Specific/metabolism , Histidine/metabolism , Molecular Sequence Data , Protein Conformation , Sequence Homology, Amino Acid , Site-Specific DNA-Methyltransferase (Adenine-Specific)/chemistry , Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism
17.
Proteins ; Suppl 5: 171-83, 2001.
Article in English | MEDLINE | ID: mdl-11835495

ABSTRACT

The results of the second Critical Assessment of Fully Automated Structure Prediction (CAFASP2) are presented. The goals of CAFASP are to (i) assess the performance of fully automatic web servers for structure prediction, by using the same blind prediction targets as those used at CASP4, (ii) inform the community of users about the capabilities of the servers, (iii) allow human groups participating in CASP to use and analyze the results of the servers while preparing their nonautomated predictions for CASP, and (iv) compare the performance of the automated servers to that of the human-expert groups of CASP. More than 30 servers from around the world participated in CAFASP2, covering all categories of structure prediction. The category with the largest participation was fold recognition, where 24 CAFASP servers filed predictions along with 103 other CASP human groups. The CAFASP evaluation indicated that it is difficult to establish an exact ranking of the servers because the number of prediction targets was relatively small and the differences among many servers were also small. However, roughly a group of five "best" fold recognition servers could be identified. The CASP evaluation identified the same group of top servers albeit with a slightly different relative order. Both evaluations ranked a semiautomated method named CAFASP-CONSENSUS, that filed predictions using the CAFASP results of the servers, above any of the individual servers. Although the predictions of the CAFASP servers were available to human CASP predictors before the CASP submission deadline, the CASP assessment identified only 11 human groups that performed better than the best server. Furthermore, about one fourth of the top 30 performing groups corresponded to automated servers. At least half of the top 11 groups corresponded to human groups that also had a server in CAFASP or to human groups that used the CAFASP results to prepare their predictions. In particular, the CAFASP-CONSENSUS group was ranked 7. This shows that the automated predictions of the servers can be very helpful to human predictors. We conclude that as servers continue to improve, they will become increasingly important in any prediction process, especially when dealing with genome-scale prediction tasks. We expect that in the near future, the performance difference between humans and machines will continue to narrow and that fully automated structure prediction will become an effective companion and complement to experimental structural genomics.


Subject(s)
Protein Conformation , Software , Automation , Models, Molecular , Protein Structure, Secondary , Protein Structure, Tertiary , Sequence Analysis, Protein , Sequence Homology
18.
Proteins ; Suppl 5: 184-91, 2001.
Article in English | MEDLINE | ID: mdl-11835496

ABSTRACT

The aim of LiveBench is to provide a continuous evaluation of structure prediction servers to inform developers and users about the current state-of-the-art structure prediction tools. LiveBench differs from other evaluation experiments because it is a large-scale and a fully automated procedure. Since LiveBench-1, which finished in April 2000, and related but independent CASP3 and CAFASP1 experiments, significant progress in the field has occurred. Some of the new developments have already been assessed at the recent CASP4 and CAFASP2 experiments (both independently of LiveBench), but others have not been observed yet because they entail developments carried out only recently. These include the availability of new servers (Pcons, FUGUE, and Coblath) and the enhancement of previously existing tools (mGenThreader, Sam-T, and 3D-PSSM), which illustrate the fast rate at which the field is advancing. Consequently, to keep pace with the development, we present the results of the second large-scale evaluation of protein structure prediction servers. Of the 11 fold recognition servers evaluated, two servers appear to be most sensitive. One of these is 3D-PSSM, a server significantly improved after LiveBench-1. The other top performer is the new consensus server Pcons, which significantly outperformed other servers in the specificity of predictions. LiveBench-2 shows that the top performing servers are able to accurately recognize a fold for about one third of the "difficult" targets, a clear improvement over LiveBench-1 results. Given that automated structure prediction is increasingly becoming a biologist's companion, the guidelines drawn from the LiveBench experiments are likely to provide users with valuable and timely information for their prediction needs.


Subject(s)
Protein Conformation , Sequence Analysis, Protein/trends , Software , Automation , Computers , Evaluation Studies as Topic , Protein Structure, Secondary , Protein Structure, Tertiary , Sensitivity and Specificity
19.
Protein Eng ; 13(10): 667-70, 2000 Oct.
Article in English | MEDLINE | ID: mdl-11112504

ABSTRACT

In this commentary, we describe two new protein structure prediction experiments being run in parallel with the CASP experiment, which together may be regarded as the 2000 Olympic Games of structure prediction. The first new experiment is CAFASP, the Critical Assessment of Fully Automated Structure Prediction. In CAFASP, the participants are fully automated programs or Internet servers, and here the automated results of the programs are evaluated, without any human intervention. The second new experiment, named LiveBench, follows the CAFASP ideology in that it is aimed towards the evaluation of automatic servers only, while it runs on a large set of prediction targets and in a continuous fashion. Researchers will be watching the 2000 protein structure prediction Olympic Games, to be held in December, in order to learn about the advances in the classical 'human-plus-machine' CASP category, the fully automated CAFASP category, and the comparison between the two.


Subject(s)
Models, Molecular , Proteins/chemistry , Amino Acid Sequence , Animals , Electronic Data Processing , Humans , Internet , Protein Structure, Tertiary
20.
Bioinformatics ; 16(9): 776-85, 2000 Sep.
Article in English | MEDLINE | ID: mdl-11108700

ABSTRACT

MOTIVATION: Evaluating the accuracy of predicted models is critical for assessing structure prediction methods. Because this problem is not trivial, a large number of different assessment measures have been proposed by various authors, and it has already become an active subfield of research. The CASP (Moult et al., 1997, 1999) and CAFASP (Fischer et al., 1999) prediction experiments have demonstrated that it has been difficult to choose one single, 'best' method to be used in the evaluation. Consequently, the CASP3 evaluation was carried out using an extensive set of specially developed numerical measures, coupled with human-expert intervention. As part of our efforts towards a higher level of automation in the structure prediction field, here we investigate the suitability of a fully automated, simple, objective, quantitative and reproducible method that can be used in the automatic assessment of models in the upcoming CAFASP2 experiment. Such a method should (a) produce one single number that measures the quality of a predicted model and (b) perform similarly to human-expert evaluations. RESULTS: MaxSub is a new and independently developed method that builds on and extends some of the evaluation methods introduced at CASP3. MaxSub aims at identifying the largest subset of C(alpha) atoms of a model that superimpose 'well' over the experimental structure, and produces a single normalized score that represents the quality of the model. Because there exists no evaluation method for assessment measures of predicted models, it is not easy to evaluate how good our new measure is. Even though an exact comparison of MaxSub and the CASP3 assessment is not straightforward, here we use a test-bed extracted from the CASP3 fold-recognition models. A rough qualitative comparison of the performance of MaxSub vis-a-vis the human-expert assessment carried out at CASP3 shows that there is a good agreement for the more accurate models and for the better predicting groups. As expected, some differences were observed among the medium to poor models and groups. Overall, the top six predicting groups ranked using the fully automated MaxSub are also the top six groups ranked at CASP3. We conclude that MaxSub is a suitable method for the automatic evaluation of models.
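
The idea of a single normalized score over the well-superimposed C(alpha) atoms can be illustrated with a minimal sketch. This is not the published MaxSub implementation. It assumes the model and the experimental structure are already superimposed in a common frame and aligned residue by residue, whereas the method described above also searches for the superposition that maximizes the subset. The function name `maxsub_like_score`, the 3.5 Angstrom distance threshold, and the per-residue weighting 1 / (1 + (d/threshold)^2) are assumptions chosen for illustration and are not stated in the abstract.

```python
import math


def maxsub_like_score(model_ca, native_ca, threshold=3.5):
    """Return (score, matched_indices) for two pre-superimposed C-alpha traces.

    model_ca, native_ca: equal-length lists of (x, y, z) tuples, where
    position i in the model corresponds to position i in the native.
    threshold: assumed distance cutoff (Angstrom) for a 'well' superimposed atom.
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("model and native must have the same number of residues")
    if not native_ca:
        raise ValueError("structures must contain at least one residue")

    total = 0.0
    matched = []
    for i, (m, n) in enumerate(zip(model_ca, native_ca)):
        d = math.dist(m, n)  # Euclidean distance between paired C-alpha atoms
        if d <= threshold:
            matched.append(i)
            # Assumed weighting: 1.0 for a perfect match, 0.5 at the cutoff.
            total += 1.0 / (1.0 + (d / threshold) ** 2)

    # Normalize by the length of the experimental structure so the
    # score lies between 0 (nothing matches) and 1 (perfect model).
    return total / len(native_ca), matched


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 10.0, 0.0)]
    score, subset = maxsub_like_score(model, native)
    print(f"score={score:.2f}, matched residues={subset}")
```

In this toy example three of the four residues coincide exactly and the fourth lies 10 Angstrom away, so only the first three enter the subset. Normalizing by the length of the experimental structure rather than by the size of the subset means that a model covering only a small fragment cannot obtain a high score.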


Subject(s)
Algorithms , Computational Biology/methods , Computer Simulation , Protein Folding , Software Validation , Amino Acid Sequence , Bacterial Proteins/chemistry , Models, Molecular , Numerical Analysis, Computer-Assisted , Peptide Initiation Factors/chemistry , Predictive Value of Tests , Protein Structure, Tertiary , Reproducibility of Results