Sequence-dependent and -independent information in a combined random energy model for protein folding and coding.

Pereira de Araújo, Antônio F

Affiliation

Pereira de Araújo AF; Laboratório de Biofísica Teórica, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil.

Proteins; 92(5): 679-687, 2024 May.

Article in En | MEDLINE | ID: mdl-38158239

ABSTRACT

Random energy models (REMs) provide a simple description of the energy landscapes that guide protein folding and evolution. The requirement of a large energy gap between the native structure and unfolded conformations, considered necessary for cooperative, protein-like folding behavior, indicates that proteins differ markedly from random heteropolymers. It has been suggested, therefore, that natural selection might have acted to choose nonrandom amino acid sequences satisfying this particular condition, implying that a large fraction of possible, unselected random sequences would not fold to any structure. From an informational perspective, however, this scenario could indicate that protein structures, regarded as messages to be transmitted through a communication channel, would not be efficiently encoded in amino acid sequences, regarded as the communication channel for this transmission, since a large fraction of possible channel states would not be used. Here, we use a combined REM for conformations and sequences, with previously estimated parameters for natural proteins, to explore an alternative possibility in which the appropriate shape of the landscape results mainly from the deviation from randomness of possible native structures instead of sequences. We observe that this situation emerges naturally if the distribution of conformational energies happens to arise from two independent contributions corresponding to sequence-dependent and -independent terms. This construction is consistent with the hypothesis of a protein burial folding code, with native structures being determined by a modest amount of sequence-dependent atomic burial information together with sequence-independent constraints imposed by unspecific hydrogen bond formation.
More generally, an appropriate combination of sequence-dependent and -independent information accommodates the possibility of an efficient structural encoding with the main physical requirement for folding, providing possible insight not only into the folding process but also into several aspects of sequence evolution such as neutral networks, conformational coverage, and de novo gene emergence.
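
As a rough illustration of the idea that conformational energies may arise from two independent contributions, the following minimal Python sketch draws a sequence-dependent and a sequence-independent Gaussian term for each conformation and adds them. It is not the model or the parameter set of the article: the function names, the zero-mean Gaussian form, the widths `sigma_dep` and `sigma_indep`, and the number of conformations are all assumptions made here for illustration. The ground-state expression is the textbook REM estimate for independent Gaussian levels.

```python
import numpy as np


def sample_rem_energies(n_conformations, sigma_dep, sigma_indep, seed=0):
    """Draw toy conformational energies as a sum of two independent terms.

    Both terms are assumed (hypothetically) to be zero-mean Gaussians:
    e_dep stands in for a sequence-dependent contribution and e_indep
    for a sequence-independent one.
    """
    rng = np.random.default_rng(seed)
    e_dep = rng.normal(0.0, sigma_dep, size=n_conformations)
    e_indep = rng.normal(0.0, sigma_indep, size=n_conformations)
    return e_dep, e_indep, e_dep + e_indep


def expected_ground_state(n_conformations, sigma):
    """Leading-order REM estimate of the lowest of n independent
    zero-mean Gaussian energy levels with standard deviation sigma."""
    return -sigma * np.sqrt(2.0 * np.log(n_conformations))


if __name__ == "__main__":
    n = 200_000
    e_dep, e_indep, e_total = sample_rem_energies(n, 1.0, 2.0, seed=1)
    # For independent terms the variances add.
    print("variance of total energy:", np.var(e_total))
    print("sum of component variances:", np.var(e_dep) + np.var(e_indep))
    # Compare the sampled minimum with the leading-order estimate.
    sigma_total = np.sqrt(1.0**2 + 2.0**2)
    print("lowest sampled energy:", e_total.min())
    print("REM estimate:", expected_ground_state(n, sigma_total))
```

Because the two terms are independent, the width of the total energy distribution is set by the sum of the two variances, so in this toy picture the spectrum can be broadened or narrowed by either contribution alone. The leading-order estimate tends to overshoot the sampled minimum at finite `n`, so the comparison is only qualitative.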

Subject(s)

Protein Folding; Proteins; Protein Conformation; Thermodynamics; Models, Molecular; Proteins/genetics; Proteins/chemistry

Key words

burial folding code; protein folding; random energy model; sequence information

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Proteins / Protein Folding | Language: En | Journal: Proteins | Journal subject: Biochemistry | Year: 2024 | Document type: Article | Affiliation country: Brazil | Country of publication: United States
