Search | VHL Regional Portal

Constrained Layer Assignment for the Protein Burial Folding Code Accounting for Chain Connectivity.

van der Linden, Marx G; Ferreira, Diogo C; Pereira de Araújo, Antônio F.

J Phys Chem B ; 126(33): 6159-6170, 2022 Aug 25.

Article in English | MEDLINE | ID: mdl-35952378

ABSTRACT

The connection between protein sequences and tertiary structures has intrigued investigators for decades. A plausible hypothesis for the coding scheme postulates that atomic burial information obtainable from the sequence could be sufficient for structural determination when combined with sequence-independent constraints. Accordingly, folding simulations using native burial information expressed by atomic central distances, discretized into a small number L of equiprobable burial layers, have indeed been successful in reaching and distinguishing the native structure of several globular proteins. Attempted predictions of layers from sequence, however, turned out to be insufficiently accurate for most proteins. Here we explore the possibility that a nonuniform assignment of layers, which is intended to account for constraints imposed by chain connectivity, might provide a more efficient burial encoding of tertiary structures. We consider the condition that adjacent Cα-atoms along the sequence cannot occupy nonadjacent layers, in which case the information required to specify sequences of burials would be smaller. It is shown that appropriate folding behavior can still be observed in this explicitly more constrained scenario with a structure-dependent assignment intended to produce the thinnest possible layers still compatible with the imposed burial constraint. This thinnest assignment turns out to be sufficiently restrictive for the observed examples and provides appropriately thinner layers or, equivalently, a larger number of layers, for examples previously observed to indeed require more restrictive constraints when compared to counterparts of similar size, as well as the appropriate increase in number of layers for larger proteins. Implications for the general understanding of the protein folding code are discussed.
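The two ingredients named in the abstract, equiprobable layers of central distances and the rule that chain neighbors may not sit in nonadjacent layers, can be illustrated with a minimal Python sketch. The function names are hypothetical, the "center" is assumed to be the plain geometric center of the supplied atoms, and the code does not reproduce the structure-dependent thinnest-layer assignment described in the paper; it only discretizes distances by rank and checks the connectivity condition.

```python
import math


def central_distances(coords):
    """Distance of each atom from the geometric center of all atoms (assumed definition)."""
    n = len(coords)
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    return [math.dist(c, center) for c in coords]


def equiprobable_layers(distances, num_layers):
    """Assign layers 0..num_layers-1 by distance rank so each layer holds a near-equal count."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    layers = [0] * len(distances)
    for rank, idx in enumerate(order):
        layers[idx] = rank * num_layers // len(distances)
    return layers


def satisfies_connectivity(layers):
    """True if atoms adjacent along the chain never occupy nonadjacent layers."""
    return all(abs(a - b) <= 1 for a, b in zip(layers, layers[1:]))
```

With a uniform (equiprobable) assignment, `satisfies_connectivity` may well return `False` for thin layers; the abstract's nonuniform assignment is intended to avoid exactly that.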

Subject(s)

Protein Folding , Proteins , Amino Acid Sequence , Burial , Models, Molecular , Protein Conformation , Proteins/chemistry

Information and redundancy in the burial folding code of globular proteins within a wide range of shapes and sizes.

Ferreira, Diogo C; van der Linden, Marx G; de Oliveira, Leandro C; Onuchic, José N; de Araújo, Antônio F Pereira.

Proteins ; 84(4): 515-31, 2016 Apr.

Article in English | MEDLINE | ID: mdl-26815167

ABSTRACT

Recent ab initio folding simulations for a limited number of small proteins have corroborated a previous suggestion that atomic burial information obtainable from sequence could be sufficient for tertiary structure determination when combined with sequence-independent geometrical constraints. Here, we use simulations parameterized by native burials to investigate the required amount of information in a diverse set of globular proteins comprising different structural classes and a wide size range. Burial information is provided by a potential term pushing each atom towards one among a small number L of equiprobable concentric layers. An upper bound for the required information is provided by the minimal number of layers L(min) still compatible with correct folding behavior. We obtain L(min) between 3 and 5 for seven small to medium proteins with 50 ≤ Nr ≤ 110 residues while for a larger protein with Nr = 141 we find that L ≥ 6 is required to maintain native stability. We additionally estimate the usable redundancy for a given L ≥ L(min) from the burial entropy associated with the largest folding-compatible fraction of "superfluous" atoms, for which the burial term can be turned off or target layers can be chosen randomly. The estimated redundancy for small proteins with L = 4 is close to 0.8. Our results are consistent with the above-average quality of burial predictions used in previous simulations and indicate that the fraction of approachable proteins could increase significantly with even a mild, plausible, improvement on sequence-dependent burial prediction or on sequence-independent constraints that augment the detectable redundancy during simulations.

Subject(s)

Algorithms , Models, Molecular , Proteins/chemistry , Amino Acid Sequence , Computer Simulation , Monte Carlo Method , Protein Folding , Protein Structure, Tertiary , Thermodynamics

Information-theoretic analysis and prediction of protein atomic burials: on the search for an informational intermediate between sequence and structure.

Rocha, Juliana R; van der Linden, Marx G; Ferreira, Diogo C; Azevêdo, Paulo H; Pereira de Araújo, Antônio F.

Bioinformatics ; 28(21): 2755-62, 2012 Nov 01.

Article in English | MEDLINE | ID: mdl-22923297

ABSTRACT

MOTIVATION: It has been recently suggested that atomic burials, as expressed by molecular central distances, contain sufficient information to determine the tertiary structure of small globular proteins. A possible approach to structural determination from sequence could therefore involve a sequence-to-burial intermediate prediction step whose accuracy, however, is theoretically limited by the mutual information between these two variables. We use a non-redundant set of globular protein structures to estimate the mutual information between local amino acid sequence and atomic burials. Discretizing central distances of Cα or Cβ atoms in equiprobable burial levels, we estimate relevant mutual information measures that are compared with actual predictions obtained from a Naive Bayesian Classifier (NBC) and a Hidden Markov Model (HMM). RESULTS: Mutual information density for 20 amino acids and two or three burial levels was estimated to be roughly 15% of the unconditional burial entropy density. Lower estimates for the mutual information between local amino acid sequence and burial of a single residue indicated an increase in mutual information with the number of burial levels up to at least five or six levels. Prediction schemes were found to efficiently extract the available burial information from local sequence. Lower estimates for the mutual information involving single burials are consistently approached by predictions from the NBC and actually surpassed by predictions from the HMM. Near-optimal prediction for the HMM is indicated by the agreement between its density of prediction information and the corresponding density of mutual information between input and output representations. AVAILABILITY: The dataset of protein structures and the prediction implementations are available at http://www.btc.unb.br/ (in 'Software').
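As a rough illustration of the quantity discussed in this abstract, the sketch below computes a plug-in (frequency-count) estimate of the mutual information between two paired discrete variables, for example residue type and burial level at the same position. The function names are hypothetical, and the estimator is an assumption chosen for simplicity: the abstract does not state which estimator the authors used, and plug-in estimates are known to be biased for small samples.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of the empirical distribution of `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must be paired observations of equal length")
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))
```

Dividing `mutual_information(residues, levels)` by `entropy(levels)` gives the fraction of burial entropy accounted for by the residue type, comparable in spirit (though not in method) to the roughly 15% figure quoted above.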

Subject(s)

Models, Molecular , Models, Statistical , Proteins/chemistry , Algorithms , Amino Acid Sequence , Bayes Theorem , Entropy , Markov Chains , Protein Structure, Tertiary , Software

Thermo-search: lifestyle and thermostability analysis.

Farias, Savio T; van der Linden, Marx G; Rêgo, Thais G; Araújo, Demétrius A M; Bonato, Maria Christina M.

In Silico Biol ; 4(3): 377-80, 2004.

Article in English | MEDLINE | ID: mdl-15724287

ABSTRACT

Thermo-search is an online web tool for the analysis of proteomes and individual proteins according to the ratio of two couplets of preferred and avoided amino acids in hyperthermophiles, thermophiles and mesophiles. It displays the ratio between glutamic acid plus lysine (E+K) and glutamine plus histidine (Q+H), which is higher in thermophilic proteomes and thermostable proteins than in mesophilic proteomes and thermolabile proteins. Thermo-search allows a rapid screen of the CRM database for thermostable proteins in their functional categories and a visualization of the (E+K)/(Q+H) average ratio between organisms, allowing a comparison of their lifestyles.
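The (E+K)/(Q+H) ratio itself is simple to compute from a one-letter amino acid sequence. The following is a minimal sketch, not the Thermo-search implementation; the function name is hypothetical, and returning `None` when a sequence contains no Q or H is an assumption made here to avoid division by zero.

```python
def ek_qh_ratio(sequence):
    """Return (E+K)/(Q+H) for a one-letter protein sequence, or None if Q+H is zero."""
    seq = sequence.upper()
    ek = seq.count("E") + seq.count("K")  # glutamic acid plus lysine
    qh = seq.count("Q") + seq.count("H")  # glutamine plus histidine
    if qh == 0:
        return None  # ratio undefined; handling is an assumption of this sketch
    return ek / qh
```

For a proteome-level value, one option is to sum the E+K and Q+H counts over all proteins before dividing, which avoids undefined per-protein ratios.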

Subject(s)

Amino Acids/chemistry , Proteome , Temperature
