Benchmarking compound activity prediction for real-world drug discovery applications.
Tian, Tingzhong; Li, Shuya; Zhang, Ziting; Chen, Lin; Zou, Ziheng; Zhao, Dan; Zeng, Jianyang.
Affiliation
  • Tian T; Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.
  • Li S; Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.
  • Zhang Z; Department of Automation, Tsinghua University, Beijing, China.
  • Chen L; MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China.
  • Zou Z; Silexon AI Technology Co., Ltd., Nanjing, Jiangsu Province, China.
  • Zhao D; Silexon AI Technology Co., Ltd., Nanjing, Jiangsu Province, China.
  • Zeng J; Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China. zhaodan2018@tsinghua.edu.cn.
Commun Chem; 7(1): 127, 2024 Jun 04.
Article in En | MEDLINE | ID: mdl-38834746
ABSTRACT
Identifying active compounds for target proteins is fundamental in early drug discovery. Recently, data-driven computational methods have demonstrated promising potential in predicting compound activities. However, a well-designed benchmark for comprehensively evaluating these methods from a practical perspective is still lacking. To fill this gap, we propose a Compound Activity benchmark for Real-world Applications (CARA). By carefully distinguishing assay types, designing train-test splitting schemes, and selecting evaluation metrics, CARA accounts for the biased distribution of current real-world compound activity data and avoids overestimating model performance. We observed that although current models can make successful predictions for a certain proportion of assays, their performance varied across assays. In addition, an evaluation of several few-shot training strategies showed that their performance differed by task type. Overall, we provide a high-quality dataset for developing and evaluating compound activity prediction models, and the analyses in this work may inspire better applications of data-driven models in drug discovery.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Commun Chem Year: 2024 Document type: Article Affiliation country: China Country of publication: United Kingdom
