C-iSUMO: A sumoylation site predictor that incorporates intrinsic characteristics of amino acid sequences.

López, Yosvany; Dehzangi, Abdollah; Reddy, Hamendra Manhar; Sharma, Alok.

Comput Biol Chem ; 87: 107235, 2020 Feb 19.

Article in English | MEDLINE | ID: mdl-32604027

ABSTRACT

Post-translational modifications are considered important molecular interactions in protein science. One of these modifications is sumoylation, whose computational detection has recently become a challenge. In this paper, we propose a new computational predictor that uses the sine and cosine of backbone torsion angles, together with the accessible surface area, to predict sumoylation sites. These features were computed for all the proteins in our benchmark dataset, and a training matrix consisting of sumoylation and non-sumoylation sites was created. The training matrix was balanced by undersampling the majority class (non-sumoylation sites) with the NearMiss method. Finally, an AdaBoost classifier was used to discriminate between sumoylation and non-sumoylation sites. We call our predictor "C-iSumo" because of its effective use of circular functions. C-iSumo was compared with another predictor, which it outperformed on statistical metrics such as sensitivity (0.734), accuracy (0.746) and Matthews correlation coefficient (0.494).
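The circular encoding and the undersampling step described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the use of a relative accessible surface area in [0, 1], the choice of the NearMiss-1 variant, and the value of k are all assumptions made for illustration; the abstract does not specify them. The AdaBoost step is omitted.

```python
import math

def circular_features(phi_deg, psi_deg, asa):
    """Encode one residue: sin/cos of each torsion angle plus ASA.

    Assumption: angles are in degrees and asa is a relative value in [0, 1].
    """
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    return [math.sin(phi), math.cos(phi), math.sin(psi), math.cos(psi), asa]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearmiss1(majority, minority, k=3):
    """NearMiss-1 style undersampling (illustrative sketch).

    Keeps len(minority) majority rows: those whose mean distance to
    their k nearest minority rows is smallest.
    """
    def score(row):
        nearest = sorted(euclidean(row, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=score)[:len(minority)]

# Angles of -179 and 179 degrees are only 2 degrees apart on the circle;
# the sin/cos encoding keeps them close, unlike the raw values.
wrap_gap = euclidean(circular_features(-179, 0, 0.0),
                     circular_features(179, 0, 0.0))

# Hypothetical toy data: two positive sites and four negative sites.
minority = [circular_features(-60, -45, 0.30),
            circular_features(-65, -40, 0.40)]
majority = [circular_features(-62, -43, 0.35),
            circular_features(120, 130, 0.90),
            circular_features(-179, 179, 0.10),
            circular_features(-58, -47, 0.20)]
balanced_negatives = nearmiss1(majority, minority, k=2)
```

Here `balanced_negatives` holds as many negative rows as there are positive rows, so the two classes are the same size before a classifier is trained. Maintained implementations of these steps exist in imbalanced-learn (`NearMiss`) and scikit-learn (`AdaBoostClassifier`); the abstract does not say which software the authors used.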

GlyStruct: glycation prediction using structural properties of amino acid residues.

Reddy, Hamendra Manhar; Sharma, Alok; Dehzangi, Abdollah; Shigemizu, Daichi; Chandra, Abel Avitesh; Tsunoda, Tatsuhiko.

BMC Bioinformatics ; 19(Suppl 13): 547, 2019 Feb 04.

Article in English | MEDLINE | ID: mdl-30717650

ABSTRACT

BACKGROUND: Glycation is one of the post-translational modifications (PTMs) in which sugar molecules become covalently bonded to residues in protein sequences. It has recently become one of the clinically important PTMs and is implicated in many chronic and age-related complications. Because glycation is a non-enzymatic reaction, its sequence motifs lack significant bias, which makes prediction a great challenge. RESULTS: We developed a classifier, GlyStruct, based on a support vector machine, to predict glycated and non-glycated lysine residues using structural properties of amino acid residues. The features used were secondary structure, accessible surface area and the local backbone torsion angles. For this work, a benchmark dataset was extracted containing 235 glycated and 303 non-glycated lysine residues. GlyStruct improved performance by approximately 10% over the benchmark method, Gly-PseAAC. The sensitivity, specificity, accuracy and Matthews correlation coefficient of GlyStruct were 0.7013, 0.7989, 0.7562, and 0.5065, respectively, for 10-fold cross-validation. CONCLUSION: Glycation has emerged as one of the clinically important PTMs of proteins in recent times. Therefore, the development of computational tools to predict glycation has become necessary, and such tools could help medical professionals administer drugs and manage patients more effectively. The proposed predictor classifies glycated and non-glycated lysine residues with promising results consistently across various cross-validation schemes and outperforms other state-of-the-art methods.

Subject(s)

Algorithms, Amino Acids/chemistry, Computational Biology/methods, Amino Acid Sequence, Area Under Curve, Benchmarking, Glycosylation, Humans, Peptides/chemistry, Support Vector Machine
