1.
J Biomol Struct Dyn ; 39(8): 2885-2893, 2021 May.
Article in English | MEDLINE | ID: mdl-32295482

ABSTRACT

Intrinsically disordered proteins are now widely accepted to play crucial roles in biological functions. Identification of signatures of intrinsic disorder is one of the key steps towards building a proper repertoire for their occurrence in proteomes. In this work, systematic computational synthesis of a library of all possible (3368400) dipeptides, tripeptides, tetrapeptides and pentapeptides using the natural 20 amino acids allowed us to identify 36 unique tetrapeptides present exclusively in intrinsically disordered proteins and absent in the complete primary sequence space of naturally occurring structured proteins. Further, out of more than 530000 known naturally occurring primary sequences without any structural information, 1349 sequences contain the above identified unique signatures of intrinsic disorder. These sequences, having cellular functions varying from housekeeping to metabolic to transport, more than double the number of the currently known intrinsically disordered proteins. On similar lines, we report that 26577 pentapeptide signatures exclusive to intrinsically disordered proteins, and absent in naturally occurring structured proteins, identify ∼50% of more than half-a-million curated protein sequences without structural information as intrinsically disordered. The results reported are a major leap forward in exploring functional manifestations of intrinsically disordered proteins. Communicated by Ramaswamy H. Sarma.


Subject(s)
Intrinsically Disordered Proteins , Amino Acid Sequence , Amino Acids , Peptides , Protein Conformation , Proteome
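
The library size quoted in the abstract above, and the idea of a peptide signature "exclusive" to disordered proteins, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and the toy input sequences are hypothetical, and the sketch only assumes that each protein is given as a string of one-letter amino acid codes.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids


def library_size(min_len=2, max_len=5, alphabet_size=len(AMINO_ACIDS)):
    """Count all possible peptides with lengths min_len..max_len."""
    return sum(alphabet_size ** k for k in range(min_len, max_len + 1))


def kmers(sequence, k):
    """Return the set of all contiguous peptides of length k in one sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def exclusive_kmers(disordered, structured, k):
    """Peptides of length k seen in the disordered set but never in the structured set."""
    in_disordered = set().union(*(kmers(s, k) for s in disordered))
    in_structured = set().union(*(kmers(s, k) for s in structured))
    return in_disordered - in_structured


if __name__ == "__main__":
    # 20**2 + 20**3 + 20**4 + 20**5
    print(library_size())  # 3368400
    # Toy sequences, invented purely for illustration.
    print(exclusive_kmers(["ACDE"], ["CDEF"], 2))  # {'AC'}
```

With real data, the two input lists would hold the full disordered and structured sequence sets, and k would be 4 or 5 to match the tetrapeptide and pentapeptide signatures described in the abstract.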
2.
J Biomol Struct Dyn ; 39(7): 2364-2375, 2021 Apr.
Article in English | MEDLINE | ID: mdl-32238088

ABSTRACT

Rigorous analyses of Euclidean distances between non-peptide bonded residues in structures of several thousand naturally occurring folded proteins yielded a surprising "margin of life" for percentage occurrence of individual amino acids in naturally occurring folded proteins. On one hand, the concept of "margin of life", referring to lower than expected variances in average stoichiometric occurrences of individual amino acids in folded proteins, remains unchallenged since its discovery a decade ago. On the other hand, within this past decade there has been a strong emergence of a gradual paradigm shift in biology, from sequence-structure-function in proteins to sequence-disorder-function, fuelled by discoveries on functional implications of intrinsically disordered proteins (primary sequences that do not form stable structures). Thus the applicability of "margin of life" to peptide-bonded residues in all known natural proteins, whether they adopt stable structures or are intrinsically disordered, needs to be explored. Therefore, in this work, we analyze compositions of the complete naturally occurring primary sequence space (over 560000 sequences) after dividing it into mutually exclusive subsets of structured and intrinsically disordered proteins along with a subset without any structural information. While finding that occurrence of different peptides (up to pentapeptides) is a direct consequence of the relative occurrences of their constituting residues in folded proteins, we report that structural disorder in natural proteins originates beyond the narrow stoichiometric margins of amino acids found in structured proteins. Communicated by Ramaswamy H. Sarma.


Subject(s)
Amino Acids , Intrinsically Disordered Proteins , Protein Conformation , Protein Folding
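
One way to read the claim that peptide occurrence follows from the occurrences of the constituent residues is as an independence model: the expected frequency of a peptide is the product of the frequencies of its residues. The sketch below illustrates that reading only. The independence model is an assumption made here for illustration, not a method confirmed by the abstract, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def residue_frequencies(sequences):
    """Fractional occurrence of each residue across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def expected_peptide_frequency(peptide, freqs):
    """Expected peptide frequency if residues occurred independently (assumed model)."""
    return math.prod(freqs.get(aa, 0.0) for aa in peptide)


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    freqs = residue_frequencies(["AAC", "C"])
    print(freqs)                                    # A and C each make up half the residues
    print(expected_peptide_frequency("AC", freqs))  # 0.25
```

Comparing such expected values with observed peptide counts, separately for structured and disordered subsets, is one possible way to probe the stoichiometric margins the abstract refers to.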
3.
J Biomol Struct Dyn ; 38(15): 4579-4583, 2020 Sep.
Article in English | MEDLINE | ID: mdl-31625464

ABSTRACT

The number of naturally occurring primary sequences of proteins is an infinitesimally small subset of the possible number of primary sequences that can be synthesized using 20 amino acids. Prevailing views ascribe this to slow and incremental mutational/selection evolutionary mechanisms. However, considering the large number of avenues available in the form of diversity of emerging/evolving and/or disappearing living systems for exploring the primary sequence space over the evolutionary time scale of ∼3.5 billion years, this remains a conjecture. Therefore, to investigate primary sequence space limitations, we carried out a systematic study for finding primary sequences absent in nature. We report the discovery of the smallest peptide sequence "Cysteine-Glutamine-Tryptophan-Tryptophan" that is not found in over half-a-million curated protein sequences in the UniProt (Swiss-Prot) database. Additionally, we report a library of 83605 pentapeptides that are not found in any of the known protein sequences. Compositional analyses of these absent primary sequences yield a remarkably strong power relationship between the percentage occurrence of individual amino acids in all known protein sequences and their respective frequency of occurrence in the absent peptides, regardless of their specific position in the sequences. If random evolutionary mechanisms were responsible for limitations to the primary sequence space, then one would not expect any relationship between compositions of available and absent primary sequences. Thus, we conclusively show that stoichiometric constraints on amino acids limit the primary sequence space of proteins in nature. We discuss the possibly profound implications of our findings in both evolutionary and synthetic biology. Communicated by Ramaswamy H. Sarma.


Subject(s)
Amino Acids , Proteins , Amino Acid Sequence , Databases, Protein , Proteins/genetics
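
The search for absent peptides and the power relationship described in the abstract above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, sequences are assumed to be strings of one-letter codes (in which the reported tetrapeptide would read CQWW), and the power law y = a * x**b is fitted by ordinary least squares on log-transformed values, which is one common choice among several.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids


def absent_peptides(sequences, k, alphabet=AMINO_ACIDS):
    """List every peptide of length k over the alphabet that occurs in no sequence."""
    observed = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            observed.add(seq[i:i + k])
    candidates = ("".join(p) for p in product(alphabet, repeat=k))
    return [pep for pep in candidates if pep not in observed]


def fit_power_law(x, y):
    """Fit y = a * x**b by least squares in log-log space; return (a, b)."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xx = sum((u - mean_x) ** 2 for u in log_x)
    s_xy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = s_xy / s_xx
    a = math.exp(mean_y - b * mean_x)
    return a, b


if __name__ == "__main__":
    # Toy two-letter alphabet and sequence, invented purely for illustration.
    print(absent_peptides(["AB"], 2, alphabet="AB"))  # ['AA', 'BA', 'BB']
    # Synthetic points lying on y = 3 * x**2, so the fit should give a near 3 and b near 2.
    print(fit_power_law([1, 2, 4], [3, 12, 48]))
```

For pentapeptides the full candidate space holds 3.2 million strings, which a set-based lookup like this handles comfortably in memory. In the fit, x would be the percentage occurrence of each amino acid in the known sequences and y its frequency in the absent peptides.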