Identifying anti-coronavirus peptides by incorporating different negative datasets and imbalanced learning strategies.
Pang, Yuxuan; Wang, Zhuo; Jhong, Jhih-Hua; Lee, Tzong-Yi.
  • Pang Y; Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, P.R. China.
  • Wang Z; School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, P.R. China.
  • Jhong JH; Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, P.R. China.
  • Lee TY; Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Shenzhen, 518172, P.R. China.
Brief Bioinform; 22(2): 1085-1095, 2021 03 22.
Article in English | MEDLINE | ID: covidwho-1343658
ABSTRACT
With SARS-CoV-2 outbreaks ongoing worldwide, effective therapeutic agents that inhibit the pathogen or treat the related diseases are urgently needed. Antimicrobial peptides (AMPs) with functional activity against coronaviruses are a promising option, yet no prior research has identified anti-coronavirus (anti-CoV) peptides computationally. In this study, we first investigated the physicochemical and compositional properties of the collected anti-CoV peptides by comparing them against three negative sets: antiviral peptides without anti-CoV function (antivirus), regular AMPs without antiviral function (non-AVP) and peptides without antimicrobial function (non-AMP). We then built random forest classifiers to distinguish anti-CoV peptides from each negative set. Imbalanced learning strategies were adopted because of the severe class imbalance within the datasets. The geometric mean of sensitivity and specificity (GMean) for identification against the antivirus, non-AVP and non-AMP sets reaches 83.07%, 85.51% and 98.82%, respectively. Next, to identify anti-CoV peptides among broad-spectrum peptides, we designed a two-stage classifier based on the collected datasets. In the first stage, the classifier distinguishes AMPs from regular peptides and achieves an area under the receiver operating characteristic curve (AUCROC) of 97.31%. The second stage identifies anti-CoV peptides against the combined negatives of other AMPs; here, the GMean on the independent test set is 79.42%. The proposed approach offers a practical scheme for assisting the development of novel anti-CoV peptides. The datasets and source code used in this study are available at https://github.com/poncey/PreAntiCoV.
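The GMean metric reported in the abstract can be illustrated with a minimal sketch. The function name `gmean`, the label convention (1 = anti-CoV, 0 = negative) and the toy labels are assumptions made for illustration; they are not taken from the authors' repository.

```python
from math import sqrt


def gmean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive correctly identified
            else:
                fn += 1  # positive missed
        else:
            if pred == positive:
                fp += 1  # negative wrongly flagged
            else:
                tn += 1  # negative correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sqrt(sensitivity * specificity)


# Toy imbalanced example: 2 positives, 8 negatives (made-up labels).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = gmean(y_true, y_pred)

# A classifier that always predicts the majority class scores zero.
majority_only = gmean(y_true, [0] * len(y_true))
```

Under class imbalance, accuracy can look high even when the minority class is mostly missed, whereas GMean drops to zero whenever either sensitivity or specificity is zero. That property makes it a natural choice when positives are scarce.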
Full text: Available Collection: International databases Database: MEDLINE Main subject: Antiviral Agents / Antimicrobial Cationic Peptides / SARS-CoV-2 / Learning Type of study: Diagnostic study / Experimental Studies / Prognostic study / Randomized controlled trials Limits: Humans Language: English Journal: Brief Bioinform Journal subject: Biology / Medical Informatics Year: 2021 Document Type: Article
