Results 1 - 2 of 2
1.
Mol Cell ; 12(6): 1353-65, 2003 Dec.
Article in English | MEDLINE | ID: mdl-14690591

ABSTRACT

Interpreting genome sequences requires the functional analysis of thousands of predicted proteins, many of which are uncharacterized and without obvious homologs. To assess whether the roles of large sets of uncharacterized genes can be assigned by targeted application of a suite of technologies, we used four complementary protein-based methods to analyze a set of 100 uncharacterized but essential open reading frames (ORFs) of the yeast Saccharomyces cerevisiae. These proteins were subjected to affinity purification and mass spectrometry analysis to identify copurifying proteins, two-hybrid analysis to identify interacting proteins, fluorescence microscopy to localize the proteins, and structure prediction methodology to predict structural domains or identify remote homologies. Integration of the data assigned function to 48 ORFs using at least two of the Gene Ontology (GO) categories of biological process, molecular function, and cellular component; 77 ORFs were annotated by at least one method. This combination of technologies, coupled with annotation using GO, is a powerful approach to classifying genes.


Subject(s)
Computational Biology; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; Genome, Fungal; Oligonucleotide Array Sequence Analysis; Open Reading Frames; Proteome/analysis; Two-Hybrid System Techniques
2.
J Mol Biol ; 322(1): 65-78, 2002 Sep 06.
Article in English | MEDLINE | ID: mdl-12215415

ABSTRACT

We use the Rosetta de novo structure prediction method to produce three-dimensional structure models for all Pfam-A sequence families with average length under 150 residues and no link to any protein of known structure. To estimate the reliability of the predictions, the method was calibrated on 131 proteins of known structure. For approximately 60% of the proteins one of the top five models was correctly predicted for 50 or more residues, and for approximately 35%, the correct SCOP superfamily was identified in a structure-based search of the Protein Data Bank using one of the models. This performance is consistent with results from the fourth critical assessment of structure prediction (CASP4). Correct and incorrect predictions could be partially distinguished using a confidence function based on a combination of simulation convergence, protein length and the similarity of a given structure prediction to known protein structures. While the limited accuracy and reliability of the method precludes definitive conclusions, the Pfam models provide the only tertiary structure information available for the 12% of publicly available sequences represented by these large protein families.


Subject(s)
Computational Biology/methods; Proteins/chemistry; Proteins/classification; Calibration; Computer Simulation; Databases, Protein; Models, Molecular; Protein Folding; Protein Structure, Tertiary; Proteins/metabolism; Sensitivity and Specificity; Sequence Alignment