PreDTIs: prediction of drug-target interactions based on multiple feature information using gradient boosting framework with data balancing and feature selection techniques.
Mahmud, S M Hasan; Chen, Wenyu; Liu, Yongsheng; Awal, Md Abdul; Ahmed, Kawsar; Rahman, Md Habibur; Moni, Mohammad Ali.
  • Mahmud SMH; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Chen W; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Liu Y; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Awal MA; Electronics and Communication Engineering Discipline, Khulna University, Khulna 9208, Bangladesh.
  • Ahmed K; Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail-1902, Bangladesh.
  • Rahman MH; Department of Computer Science and Engineering, Islamic University, Kushtia-7003, Bangladesh.
  • Moni MA; UNSW Digital Health, WHO Center for eHealth, School of Public Health and Community Medicine, Faculty of Medicine, The University of New South Wales, Sydney, Australia.
Brief Bioinform; 22(5), 2021-09-02.
Article in English | MEDLINE | ID: covidwho-1132434
ABSTRACT
Discovering drug-target (protein) interactions (DTIs) is of great significance for researching and developing novel drugs, and it benefits both the pharmaceutical industry and patients. However, identifying DTIs with wet-lab experimental methods is generally expensive and time-consuming. Different machine learning-based methods have therefore been developed for this purpose, but a substantial number of unknown interactions remain to be discovered. Furthermore, data imbalance and high feature dimensionality are critical challenges in drug-target datasets; they can degrade classifier performance and have not yet been adequately addressed. This paper proposes a novel drug-target interaction prediction method called PreDTIs. First, feature vectors of the protein sequence are extracted using the pseudo-position-specific scoring matrix (PsePSSM), dipeptide composition (DC) and pseudo amino acid composition (PseAAC), and the drug is encoded with MACCS substructure fingerprints. Next, we propose a FastUS algorithm to handle the class imbalance problem and develop a MoIFS algorithm to remove irrelevant and redundant features and obtain an optimal feature subset. Finally, the balanced and optimal features are provided to the LightGBM classifier to identify DTIs, and 5-fold cross-validation (CV) was applied to evaluate the prediction ability of the proposed method. Prediction results indicate that the proposed model PreDTIs is significantly superior to other existing methods in predicting DTIs, and that the model could be used to discover new drugs for unknown disorders or infections, such as coronavirus disease 2019, using existing drug compounds and severe acute respiratory syndrome coronavirus 2 protein sequences.
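As a rough illustration of three steps named in the abstract, the sketch below computes a dipeptide composition (DC) vector, balances a binary dataset, and generates 5-fold splits, using only the Python standard library. It is not the authors' implementation. The balancing step is plain random undersampling used as a stand-in, because the abstract does not describe how FastUS works. All function names are hypothetical, and PsePSSM, PseAAC, MACCS fingerprints, MoIFS and the LightGBM classifier are not covered.

```python
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order that defines the vector layout.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide composition of a protein sequence.

    Each entry is the count of one adjacent residue pair divided by the number
    of counted pairs. Pairs containing a non-standard residue are skipped.
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    seq = sequence.upper()
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def random_undersample(features, labels, seed=0):
    """Balance a binary (0/1) dataset by randomly dropping majority samples.

    Assumption: this is ordinary random undersampling, a stand-in for FastUS,
    whose details are not given in the abstract.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [features[i] for i in kept], [labels[i] for i in kept]


def stratified_kfold_indices(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for k folds, stratified by class."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal each class round-robin so every fold keeps the class ratio.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test
```

In a fuller pipeline, each balanced training split would be passed to a feature selector and then to a gradient boosting classifier, with the held-out split used only for scoring. Undersampling inside each training fold, rather than before splitting, is one common way to keep the test folds representative; the abstract does not say which order the authors used.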

Full text: Available Collection: International databases Database: MEDLINE Main subject: Pharmaceutical Preparations / Proteins / Computational Biology Type of study: Experimental Studies / Prognostic study Language: English Journal subject: Biology / Medical Informatics Year: 2021 Document Type: Article Affiliation country: Bib
