ABSTRACT
The problem of predicting the three-dimensional structure of a protein starting from its amino acid sequence is regarded as one of the most important open problems in biology. Here, we solve aspects of this problem for the so-called sandwich proteins that constitute a large class of proteins consisting of only beta-strands arranged in two sheets. A breakthrough for this class of proteins was announced in Kister et al. (Kister et al. 2002 Proc. Natl Acad. Sci. USA 99, 14137-14141), in which it was shown that sandwich proteins contain a certain invariant substructure called an interlock. It was later noted that approximately 90% of the observed sandwich proteins are canonical, namely they are generated by certain geometrical structures. Here, employing a topological investigation, we prove that interlocks and geometrical structures are the direct consequence of certain biologically motivated fundamental principles. Furthermore, we construct all possible canonical motifs involving 6-10 strands. This construction dramatically limits the number of possible motifs. For example, for sandwich proteins with nine strands, the a priori number of possible canonical motifs exceeds 360000, whereas our construction yields only 49 geometrical structures and 625 canonical motifs.
Subject(s)
Amino Acid Motifs, Models, Molecular, Protein Folding, Protein Structure, Secondary, Proteins/chemistry, Biophysical Phenomena
ABSTRACT
From a computer analysis of the spatial organization of the secondary structures of beta-sandwich proteins, we find certain sets of consecutive strands that are connected by hydrogen bonds, which we call "strandons." The analysis of the arrangements of strandons in 491 protein structures that come from 69 different superfamilies reveals strict regularities in the arrangements of strandons and the formation of what we call "canonical supermotifs." Six such supermotifs account for approximately 90% of all observed structures. Simple geometric rules are described that dictate the formation of these supermotifs.
Subject(s)
Proteins/chemistry, Amino Acid Motifs, Amino Acid Sequence, Biophysical Phenomena, Biophysics, Hydrogen Bonding, Models, Molecular, Molecular Sequence Data, Plastocyanin/chemistry, Plastocyanin/genetics, Protein Structure, Secondary
ABSTRACT
For a large class of proteins called sandwich-like proteins (SPs), the secondary structures consist of two beta-sheets packed face-to-face, with each beta-sheet consisting typically of three to five beta-strands. An important step in the prediction of the three-dimensional structure of an SP is the prediction of its supersecondary structure, namely the prediction of the arrangement of the beta-strands in the two beta-sheets. Recently, significant progress was made in this direction when it was shown that 91% of observed SPs form what we here call "canonical motifs." Here, we show that all canonical motifs can be constructed in a simple manner that is based on thermodynamic considerations and uses certain geometric structures. The number of these structures is much smaller than the number of possible strand arrangements. For instance, whereas for SPs consisting of six strands there exist a priori 900 possible strand arrangements, there exist only five geometric structures. Furthermore, the few motifs that are noncanonical can be constructed from canonical motifs by a simple procedure.