ABSTRACT
Latent amino acid repeats seem to be widespread in genetic sequences and to reflect their structure, function, and evolution. We have recently identified latent periodicity in more than 150 protein families, including protein kinases and various nucleotide-binding proteins. The latent repeats in these families correlated with their structure and evolution. However, our latent periodicity search algorithm found no periodicity in the majority of known protein families. The most likely reason is that our techniques could not identify periodicities interspersed with insertions and deletions. We designed a new latent periodicity search algorithm that takes insertions and deletions into account. As a result, we identified many novel cases of latent periodicity peculiar to protein families. Possible origins of the periodic structure of these families are discussed. In summary, we presume that latent periodicity is present in a substantial portion of known protein families. The latent periodicity matrices and the results of Swiss-Prot scans are available from http://bioinf.narod.ru/del/.
Subject(s)
Algorithms , Amino Acid Sequence , Proteins/chemistry , Adenosine Triphosphatases/chemistry , Chaperonin 60/chemistry , Endoribonucleases/chemistry , Gene Products, gag/chemistry , Models, Theoretical , Molecular Sequence Data , Nucleotidyltransferases/chemistry , Periodicity
ABSTRACT
Here, we have applied information decomposition, cyclic profile alignment, and noise decomposition techniques to search for latent repeats within protein families of various functions. We have identified 94 protein families with a family-specific periodicity. In each case, the periodic element was found in more than 70% of family members, and a latent periodicity profile with a specific length and signature was obtained. The possible relationship between the periodic elements thus identified and the evolutionary development of the protein families is discussed, with specific reference to a possible correlation between the periodic elements and protein function.
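The abstract names information decomposition without describing it. As a minimal sketch only, and assuming that the periodicity signal can be scored as the mutual information between residue identity and sequence position modulo a trial period, a latent period could be screened for as follows. The function name `periodicity_information` and the toy sequence are hypothetical, and this is not presented as the authors' implementation; it does not handle insertions or deletions.

```python
import math
from collections import Counter


def periodicity_information(seq: str, period: int) -> float:
    """Mutual information (in nats) between residue identity and
    position modulo `period`. Hypothetical helper for illustration."""
    n = len(seq)
    if n == 0 or period < 1:
        raise ValueError("need a non-empty sequence and period >= 1")

    # Joint counts of (column index, residue), plus the two marginals.
    joint = Counter((i % period, aa) for i, aa in enumerate(seq))
    columns = Counter(i % period for i in range(n))
    residues = Counter(seq)

    mi = 0.0
    for (col, aa), k in joint.items():
        # p(col, aa) * log( p(col, aa) / (p(col) * p(aa)) )
        mi += (k / n) * math.log(k * n / (columns[col] * residues[aa]))
    return mi


if __name__ == "__main__":
    toy = "ABC" * 10  # assumed toy sequence with an exact period of 3
    for p in range(1, 7):
        print(p, round(periodicity_information(toy, p), 4))
```

Raw mutual information estimated from a finite sequence is biased upward as the trial period grows, so a score like this would normally be compared against the same statistic computed on shuffled sequences before a period is called significant.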
Subject(s)
Computational Biology , Multigene Family , Sequence Analysis, Protein , Algorithms , Amino Acid Sequence , Sequence Alignment
ABSTRACT
We identified latent periodicity in the catalytic domains of approximately 85% of annotated serine-threonine and tyrosine protein kinases. Similar results were obtained for 22 other protein families and domains. We also designed a method of noise decomposition, which aims to distinguish between different periodicity types of the same period length. The method is to be used in conjunction with cyclic profile alignment, and this combination is able to reveal structure-related or function-related patterns of latent periodicity. Possible origins of the periodic structure of protein kinase active sites are discussed. In summary, we presume that latent periodicity is a common property of many catalytic protein domains.