
A pairwise residue contact area-based mean force potential for discrimination of native protein structure.

Arab, Shahriar; Sadeghi, Mehdi; Eslahchi, Changiz; Pezeshk, Hamid; Sheari, Armita.

BMC Bioinformatics ; 11: 16, 2010 Jan 09.

Article in English | MEDLINE | ID: mdl-20064218

ABSTRACT

BACKGROUND: An energy function that can distinguish a correct protein fold from incorrect ones is very important for protein structure prediction and protein folding. Knowledge-based mean force potentials are certainly the most popular type of interaction function for protein threading. They are derived from statistical analyses of interacting groups in experimentally determined protein structures. These potentials are developed at the atom or the amino acid level. Based on orientation-dependent contact area, a new type of knowledge-based mean force potential has been developed. RESULTS: We developed a new approach to calculate a knowledge-based potential of mean force, using pairwise residue contact area. To test the performance of our approach, we applied it to several decoy sets to measure its ability to discriminate native structures from decoys. The potential was able to distinguish native structures from decoys in most cases. Further, the calculated Z-scores were quite high for all protein datasets. CONCLUSIONS: This knowledge-based potential of mean force can be used in protein structure prediction, fold recognition, comparative modelling and molecular recognition. The program is available at http://www.bioinf.cs.ipm.ac.ir/softwares/surfield.
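The abstract does not give the formulas behind the potential or the Z-score, so the following is only a minimal sketch of the two generic ideas it mentions, not the authors' method. It assumes the common inverse-Boltzmann form for a knowledge-based pair energy and a Z-score defined as the gap between the decoy mean and the native energy in units of the decoy standard deviation. The function names, the `kT` parameter, and the sign convention (higher Z means better discrimination) are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def pair_energy(observed, expected, kT=1.0):
    """Generic inverse-Boltzmann energy for one residue pair.

    observed / expected are hypothetical frequencies (or contact areas)
    of the pair in known structures versus a reference state.
    """
    return -kT * math.log(observed / expected)


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native structure against a decoy set.

    Assumed convention: lower energy is better, so a well-separated
    native structure gets a large positive value.
    """
    mu = mean(decoy_energies)
    sigma = pstdev(decoy_energies)
    return (mu - native_energy) / sigma


if __name__ == "__main__":
    decoys = [-8.0, -10.0, -12.0]  # made-up energies, for illustration only
    print(native_z_score(-15.0, decoys))
```

A pair seen more often than the reference state predicts gets a negative (favourable) energy, and a native structure whose total energy lies well below the decoy distribution gets a high Z-score under this convention.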

Subject(s)

Computational Biology/methods , Protein Conformation , Proteins/chemistry , Binding Sites , Databases, Protein , Protein Folding

A tale of two symmetrical tails: structural and functional characteristics of palindromes in proteins.

Sheari, Armita; Kargar, Mehdi; Katanforoush, Ali; Arab, Shahriar; Sadeghi, Mehdi; Pezeshk, Hamid; Eslahchi, Changiz; Marashi, Sayed-Amir.

BMC Bioinformatics ; 9: 274, 2008 Jun 11.

Article in English | MEDLINE | ID: mdl-18547401

ABSTRACT

BACKGROUND: It has been previously shown that palindromic sequences are frequently observed in proteins. However, our knowledge about their evolutionary origin and their possible importance is incomplete. RESULTS: In this work, we tried to revisit this relatively neglected phenomenon. Several questions are addressed in this work. (1) It is known that there is a large chance of finding a palindrome in low complexity sequences (i.e. sequences with extreme amino acid usage bias). What is the role of sequence complexity in the evolution of palindromic sequences in proteins? (2) Do palindromes coincide with conserved protein sequences? If yes, what are the functions of these conserved segments? (3) In case of conserved palindromes, is it always the case that the whole conserved pattern is also symmetrical? (4) Do palindromic protein sequences form regular secondary structures? (5) Does sequence similarity of the two "sides" of a palindrome imply structural similarity? For the first question, we showed that the complexity of palindromic peptides is significantly lower than randomly generated palindromes. Therefore, one can say that palindromes occur frequently in low complexity protein segments, without necessarily having a defined function or forming a special structure. Nevertheless, this does not rule out the possibility of finding palindromes which play some roles in protein structure and function. In fact, we found several palindromes that overlap with conserved protein Blocks of different functions. However, in many cases we failed to find any symmetry in the conserved regions of corresponding Blocks. Furthermore, to answer the last two questions, the structural characteristics of palindromes were studied. It is shown that palindromes may have a great propensity to form alpha-helical structures. Finally, we demonstrated that the two sides of a palindrome generally do not show significant structural similarities. 
CONCLUSION: We suggest that the puzzling abundance of palindromic sequences in proteins is mainly due to their frequent concurrence with low-complexity protein regions, rather than a global role in the protein function. In addition, palindromic sequences show a relatively high tendency to form helices, which might play an important role in the evolution of proteins that contain palindromes. Moreover, reverse similarity in peptides does not necessarily imply significant structural similarity. This observation rules out the importance of palindromes for forming symmetrical structures. Although palindromes frequently overlap with conserved Blocks, we suggest that palindromes overlap with Blocks only by coincidence, rather than being involved with a certain structural fold or protein domain.
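As a rough illustration of the two quantities discussed in this abstract, the sketch below locates maximal palindromic segments in an amino acid sequence and scores their compositional complexity. The abstract does not say which complexity measure or minimum palindrome length the authors used, so Shannon entropy and the `min_len=5` default are stand-in assumptions, and both function names are hypothetical.

```python
import math
from collections import Counter


def find_palindromes(seq, min_len=5):
    """Return (start, segment) for each maximal palindrome of at least min_len residues."""
    hits = []
    n = len(seq)
    # Each of the 2n - 1 centers is either on a residue or between two residues.
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2
        while left >= 0 and right < n and seq[left] == seq[right]:
            left -= 1
            right += 1
        start, end = left + 1, right  # half-open interval [start, end)
        if end - start >= min_len:
            hits.append((start, seq[start:end]))
    return hits


def shannon_entropy(seq):
    """Shannon entropy in bits per residue; low values indicate low complexity."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    for start, segment in find_palindromes("MKAAAAAKLV"):
        print(start, segment, round(shannon_entropy(segment), 3))
```

For the toy sequence `MKAAAAAKLV`, the only palindrome of five or more residues is `KAAAAAK` starting at index 1, and its entropy is well below that of a segment with evenly mixed residues, which mirrors the low-complexity effect described in the abstract.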

Subject(s)

Amino Acid Sequence/physiology , Computational Biology/methods , Proteins/analysis , Amino Acids/analysis , Binding Sites/genetics , Conserved Sequence/physiology , Databases, Protein , Evolution, Molecular , Pattern Recognition, Automated , Protein Structure, Secondary/physiology , Proteins/ultrastructure , Sequence Alignment , Sequence Homology, Amino Acid , Statistics, Nonparametric , Structure-Activity Relationship

The performances of the chi-square test and complexity measures for signal recognition in biological sequences.

Pirhaji, Leila; Kargar, Mehdi; Sheari, Armita; Poormohammadi, Hadi; Sadeghi, Mehdi; Pezeshk, Hamid; Eslahchi, Changiz.

J Theor Biol ; 251(2): 380-7, 2008 Mar 21.

Article in English | MEDLINE | ID: mdl-18177672

ABSTRACT

With large amounts of experimental data, modern molecular biology needs appropriate methods to deal with biological sequences. In this work, we apply a statistical method (Pearson's chi-square test) to recognize the signals that appear in the whole genome of Escherichia coli. To show the effectiveness of the method, we compare Pearson's chi-square test with linguistic complexity on the complete genome of E. coli. The results suggest that Pearson's chi-square test is an efficient method for distinguishing genes (coding regions) from pseudogenes (noncoding regions). On the other hand, the performance of linguistic complexity is much lower than that of the chi-square test. We also use Pearson's chi-square test to determine which parts of the Open Reading Frame (ORF) have a significant effect on discriminating genes from pseudogenes. Moreover, different complexity measures and Pearson's chi-square test are applied to the genes with high values of Pearson's chi-square statistic. We also compute the measures on homologs of these genes. The results illustrate that there is a region near the start codon with a high value of the chi-square statistic and low complexity that is conserved between homologous genes.
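The abstract does not state which expected distribution or word length the chi-square test was computed against, nor the exact definition of linguistic complexity used. The sketch below therefore makes two labelled assumptions: the chi-square statistic compares overlapping k-mer counts with a uniform expectation, and linguistic complexity is the product, over word lengths 1 to `max_k`, of the fraction of possible distinct words actually observed. Function names and defaults are hypothetical.

```python
from collections import Counter


def chi_square_uniform(seq, k=1, alphabet="ACGT"):
    """Pearson chi-square statistic of overlapping k-mer counts vs. a uniform expectation.

    Assumes every character of seq belongs to alphabet.
    """
    total = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(total))
    n_kmers = len(alphabet) ** k
    expected = total / n_kmers
    seen = sum((c - expected) ** 2 / expected for c in counts.values())
    # Each k-mer that never occurs contributes (0 - expected)^2 / expected = expected.
    unseen = (n_kmers - len(counts)) * expected
    return seen + unseen


def linguistic_complexity(seq, max_k=3, alphabet_size=4):
    """Product over k of (distinct k-mers observed) / (maximum possible distinct k-mers).

    Assumes len(seq) >= max_k.
    """
    lc = 1.0
    for k in range(1, max_k + 1):
        windows = len(seq) - k + 1
        distinct = len({seq[i:i + k] for i in range(windows)})
        lc *= distinct / min(alphabet_size ** k, windows)
    return lc


if __name__ == "__main__":
    for s in ("ACGTACGTACGTACGTACGT", "AAAAAAAAAAAAAAAAAAAA"):
        print(s, chi_square_uniform(s), linguistic_complexity(s))
```

Under these assumptions a strongly biased sequence yields a high chi-square statistic and a low linguistic complexity, which is the combination the abstract reports for the region near the start codon.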

Subject(s)

Escherichia coli/genetics , Genome, Bacterial , Open Reading Frames , Base Sequence , Chi-Square Distribution , Computational Biology , Conserved Sequence , Molecular Sequence Data , Pseudogenes , Sequence Homology
