1.
BMC Bioinformatics ; 7: 288, 2006 Jun 07.
Article in English | MEDLINE | ID: mdl-16759376

ABSTRACT

BACKGROUND: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible.

RESULTS: We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains.

CONCLUSION: This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.


Subject(s)
Algorithms; Pattern Recognition, Automated/methods; Proteome/chemistry; Proteome/classification; Sequence Alignment/methods; Amino Acid Sequence; Artificial Intelligence; Humans; Molecular Sequence Data; Protein Folding; Sequence Analysis, Protein
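
The abstract for record 1 describes a meta-scheduler that sits above per-cluster schedulers and spreads fold recognition jobs across independent Grid domains. The abstract does not say how JYDE chooses a cluster, so the following is only a minimal sketch of one common approach (greedy, longest job first, sent to the cluster with the lowest load per CPU). The `Cluster` class, the `distribute` function, and the job cost values are all hypothetical and are not taken from JYDE.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for one Grid domain (e.g. an SGE or Condor pool)."""
    name: str
    cpus: int
    assigned: list = field(default_factory=list)


def distribute(jobs, clusters):
    """Greedily assign (job_id, cost) pairs to the least-loaded cluster.

    Load is measured as total assigned cost divided by CPU count.
    This is an illustrative heuristic, not JYDE's documented policy.
    Returns a dict mapping cluster name to total assigned cost.
    """
    totals = {c.name: 0.0 for c in clusters}
    # Heap entries: (load per CPU, tie-break index, cluster)
    heap = [(0.0, i, c) for i, c in enumerate(clusters)]
    heapq.heapify(heap)

    # Place the most expensive jobs first so they are spread out early.
    for job_id, cost in sorted(jobs, key=lambda j: j[1], reverse=True):
        _, i, cluster = heapq.heappop(heap)
        cluster.assigned.append(job_id)
        totals[cluster.name] += cost
        heapq.heappush(heap, (totals[cluster.name] / cluster.cpus, i, cluster))
    return totals


if __name__ == "__main__":
    pools = [Cluster("domain_a", cpus=2), Cluster("domain_b", cpus=1)]
    work = [(f"seq{n}", 1.0) for n in range(6)]  # six equal-cost jobs
    print(distribute(work, pools))
    for pool in pools:
        print(pool.name, pool.assigned)
```

In a real deployment each assignment would be handed to the underlying cluster scheduler rather than stored in a list, and job costs would have to be estimated, for example from sequence length.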
2.
Nucleic Acids Res ; 32(Database issue): D196-9, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681393

ABSTRACT

Currently, the Genomic Threading Database (GTD) contains structural assignments for the proteins encoded within the genomes of nine eukaryotes and 101 prokaryotes. Structural annotations are carried out using a modified version of GenTHREADER, a reliable fold recognition method. The GenTHREADER annotation jobs are distributed across multiple clusters of processors using grid technology and the predictions are deposited in a relational database accessible via a web interface at http://bioinf.cs.ucl.ac.uk/GTD. Using this system, up to 84% of proteins encoded within a genome can be confidently assigned to known folds with 72% of the residues aligned. On average in the GTD, 64% of proteins encoded within a genome are confidently assigned to known folds and 58% of the residues are aligned to structures.


Subject(s)
Databases, Genetic; Genomics; Proteins/chemistry; Proteins/genetics; Animals; Computational Biology; Genome; Humans; Information Storage and Retrieval; Internet; Protein Folding; Protein Structure, Tertiary; Proteome/chemistry; Proteome/genetics; Proteomics; Software
3.
Bioinformatics ; 20(1): 131-2, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14693823

ABSTRACT

The Genomic Threading Database currently contains structural annotations for the genomes of over 100 recently sequenced organisms. Annotations are carried out by using our modified GenTHREADER software and through implementing grid technology. AVAILABILITY: http://bioinf.cs.ucl.ac.uk/GTD


Subject(s)
Database Management Systems; Databases, Protein; Documentation; Gene Expression Profiling/methods; Information Storage and Retrieval/methods; Proteome/chemistry; Proteome/genetics; Sequence Analysis, Protein/methods; Genome; Sequence Alignment/methods