1.
J Chem Inf Model ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38950192

ABSTRACT

Scaffold-hopped (SH) compounds are bioactive compounds structurally different from known active compounds. Identifying SH compounds in the ligand-based approaches has been a central issue in medicinal chemistry, and various molecular representations of scaffold hopping have been proposed. However, appropriate representations for SH compound identification remain unclear. Herein, the ability of SH compound identification among several representations was fairly evaluated based on retrospective validation and prospective demonstration. In the retrospective validation, the combinations of two screening algorithms and four two- and three-dimensional molecular representations were compared using controlled data sets for the early identification of SH compounds. We found that the combination of the support vector machine and extended connectivity fingerprint with bond diameter 4 (SVM-ECFP4) and SVM and the rapid overlay of chemical structures (SVM-ROCS) showed a relatively high performance. The compounds that were highly ranked by SVM-ROCS did not share substructures with the active training compounds, while those ranked by SVM-ECFP4 were mostly recombinant. In the prospective demonstration, 93 SH compounds were prepared by screening the Namiki database using SVM-ROCS, targeting ABL1 inhibitors. The primary screening using surface plasmon resonance suggested five active compounds; however, in the competitive binding assays with adenosine triphosphate, no hits were found.

2.
ACS Omega ; 9(8): 9463-9474, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38434845

ABSTRACT

In the pursuit of optimal quantitative structure-activity relationship (QSAR) models, two key factors are paramount: the robustness of predictive ability and the interpretability of the model. Symbolic regression (SR) searches for the mathematical expressions that explain a training data set. Thus, the models provided by SR are globally interpretable. We previously proposed an SR method that can generate expressions interpretable by humans. This study introduces an enhanced symbolic regression method, termed filter-induced genetic programming 2 (FIGP2), as an extension of our previously proposed SR method. FIGP2 is designed to improve the generalizability of SR models and to be applicable to data sets in which cost-intensive descriptors are employed. The FIGP2 method incorporates two major improvements: a modified domain filter to eradicate diverging expressions based on optimal calculation and the introduction of a stability metric to penalize expressions that would lead to overfitting. Our retrospective comparative analysis using 12 structure-activity relationship data sets revealed that FIGP2 surpassed the previously proposed SR method and conventional modeling methods, such as support vector regression and multivariate linear regression, in terms of predictive performance. The mathematical expressions generated by FIGP2 were relatively simple and did not diverge within the domain of the function. Taken together, FIGP2 can be used for making interpretable regression models with predictive ability.

3.
ACS Omega ; 8(30): 27458-27466, 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37546629

ABSTRACT

During data-driven process condition optimization on a laboratory scale, only a small-size data set is accessible and should be effectively utilized. On the other hand, during process development, new operations are frequently inserted or current operations are modified. These accessible data sets are somewhat related but not exactly the same type. In this study, we focus on the prediction of the quality of the interface between an insulator and GaN as a semiconductor for the potential application of GaN power semiconductor devices. The quality of the interface was represented as the interface state density, Dit, and the inserted operation to the process was the ultraviolet (UV)/O3-gas treatment. Our retrospective evaluation of model-building approaches for Dit prediction from a process condition revealed that for the UV/O3-treated interfaces, data of interfaces without the treatment contributed to performance improvement. Such performance improvement was not observed when using a data set of Si as the semiconductor. As a modeling method, the automatic relevance vector-based Gaussian process regression with the prior distribution of the length-scale parameters exhibited a relatively high predictive performance and represented a reasonable uncertainty of prediction as reflected by the distance to the training data set. This feature is a prerequisite for a potential application of Bayesian optimization. Furthermore, hyperparameters in the prior distribution of the length-scales could be optimized by leave-one-out cross-validation.

4.
ACS Omega ; 8(22): 19781-19788, 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37305275

ABSTRACT

Fourier-transform infrared (FTIR) spectroscopy can detect the presence of functional groups and molecules directly from a mixed solution of organic molecules. Although it is quite useful to monitor chemical reactions, quantitative analysis of FTIR spectra becomes difficult when various peaks of different widths overlap. To overcome this difficulty, we propose a chemometrics approach to accurately predict the concentration of components in chemical reactions, yet interpretable by humans. The proposed method first decomposes a spectrum into peaks with various widths by the wavelet transform. Subsequently, a sparse linear regression model is built using the wavelet coefficients. Models by the method are interpretable using the regression coefficients shown on Gaussian distributions with various widths. The interpretation is expected to reveal the relation of broad regions in spectra to the model prediction. In this study, we conducted the prediction of monomer concentration in copolymerization reactions of five monomers against methyl methacrylate by various chemometric approaches including conventional methods. A rigorous validation scheme revealed that the proposed method overall showed better predictive ability than various linear and non-linear regression methods. The visualization results were consistent with the interpretation obtained by another chemometric approach and qualitative evaluation. The proposed method is found to be useful for calculating the concentrations of monomers in copolymerization reactions and for the interpretation of spectra.

5.
Clin Exp Med ; 23(7): 3407-3416, 2023 Nov.
Article in English | MEDLINE | ID: mdl-36611087

ABSTRACT

This study aimed to clarify the differences and similarities in the cytokine profiles of macrophage activating syndrome (MAS) between systemic lupus erythematosus (SLE) and adult-onset Still's disease (AOSD). The study participants included 9 patients with MAS-SLE, 22 with non-MAS-SLE, 9 with MAS-AOSD, and 13 with non-MAS-AOSD. Serum cytokine levels were measured using a multiplex bead assay. Cytokine levels were compared between patients with SLE and AOSD with/without MAS. Moreover, cytokine patterns were examined using principal component analysis (PCA) and cluster analysis. IL-6, IL-8, IL-18, and TNF-α levels were elevated in patients with SLE and AOSD. IFN-α levels were elevated in SLE, whereas IL-1β and IL-18 levels were elevated in AOSD. In SLE, IFN-α and IL-10 levels were higher in MAS than in non-MAS and controls. PCA revealed distinctive cytokine patterns in SLE and AOSD, SLE with IFN-α and IP-10, AOSD with IL-1β, IL-6, and IL-18, and enhanced cytokine production in MAS. PCA and cluster analysis showed no differences in cytokine patterns between the MAS and non-MAS groups. However, serum ferritin levels were correlated with IFN-α levels in SLE. Cytokine profiles differed between SLE and AOSD but not between MAS and non-MAS. MAS is induced by the enhancement of underlying cytokine abnormalities rather than by MAS-specific cytokine profiles. Type I IFN may be involved in MAS development in patients with SLE.


Subject(s)
Lupus Erythematosus, Systemic , Macrophage Activation Syndrome , Still's Disease, Adult-Onset , Adult , Humans , Interleukin-18 , Macrophage Activation Syndrome/diagnosis , Interleukin-6 , Cytokines , Lupus Erythematosus, Systemic/complications
6.
J Cheminform ; 15(1): 4, 2023 Jan 07.
Article in English | MEDLINE | ID: mdl-36611204

ABSTRACT

Activity cliffs (AC) are formed by pairs of structural analogues that are active against the same target but have a large difference in potency. While much of our knowledge about ACs has originated from the analysis and comparison of compounds and activity data, several studies have reported AC predictions over the past decade. Different from typical compound classification tasks, AC predictions must be carried out at the level of compound pairs representing ACs or nonACs. Most AC predictions reported so far have focused on individual methods or comparisons of two or three approaches and only investigated a few compound activity classes (from 2 to 10). Although promising prediction accuracy has been reported in most cases, different system set-ups, AC definitions, methods, and calculation conditions were used, precluding direct comparisons of these studies. Therefore, we have carried out a large-scale AC prediction campaign across 100 activity classes comparing machine learning methods of greatly varying complexity, ranging from pair-based nearest neighbor classifiers and decision tree or kernel methods to deep neural networks. The results of our systematic predictions revealed the level of accuracy that can be expected for AC predictions across many different compound classes. In addition, prediction accuracy did not scale with methodological complexity but was significantly influenced by memorization of compounds shared by different ACs or nonACs. In many instances, limited training data were sufficient for building accurate models using different methods and there was no detectable advantage of deep learning over simpler approaches for AC prediction. On a global scale, support vector machine models performed best, by only small margins compared to others including simple nearest neighbor classifiers.
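
The pair-level framing described above can be illustrated with a minimal sketch. It assumes that fingerprints are plain sets of bit indices, that Tanimoto similarity stands in for the structural-analogue criterion, and that a potency gap of 2 log units defines a cliff. The function names and both thresholds are hypothetical choices for illustration and are not taken from the study.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def label_pairs(compounds, sim_threshold=0.55, potency_gap=2.0):
    """Label analogue pairs as 'AC' or 'nonAC'.

    compounds maps a name to (fingerprint_bits, potency on a log scale).
    Pairs below sim_threshold are not treated as analogues and are skipped.
    Both thresholds are illustrative assumptions.
    """
    labels = {}
    items = sorted(compounds.items())
    for (n1, (fp1, p1)), (n2, (fp2, p2)) in combinations(items, 2):
        if tanimoto(fp1, fp2) < sim_threshold:
            continue  # not similar enough to count as an analogue pair
        labels[(n1, n2)] = "AC" if abs(p1 - p2) >= potency_gap else "nonAC"
    return labels


compounds = {
    "a": ({1, 2, 3, 4}, 9.0),
    "b": ({1, 2, 3, 5}, 6.5),
    "c": ({1, 2, 3, 4, 5}, 8.8),
    "d": ({7, 8}, 5.0),
}
labels = label_pairs(compounds)
```

Because a compound such as "a" appears in several pairs, pair-based train/test splits can share compounds, which is the kind of overlap the abstract refers to as memorization.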

7.
ACS Pharmacol Transl Sci ; 6(1): 139-150, 2023 Jan 13.
Article in English | MEDLINE | ID: mdl-36654744

ABSTRACT

Influenza is a respiratory infection caused by the influenza virus that is prevalent worldwide. One of the most contagious variants of influenza is influenza A virus (IAV), which usually spreads in closed spaces through aerosols. Preventive measures such as novel compounds are needed that can act on viral membranes and provide a safe environment against IAV infection. In this study, we screened compounds with common fragrances that are generally used to mask unpleasant odors but can also exhibit antiviral activity against a strain of IAV. Initially, a set of 188 structurally diverse odorants were collected, and their antiviral activity was measured in vapor phase against the IAV solution. Regression models were built for the prediction of antiviral activity using this set of odorants by taking into account their structural features along with vapor pressure and partition coefficient (n-octanol/water). The models were interpreted using a feature weighting approach and Shapley Additive exPlanations to rationalize the predictions as an additional validation for virtual screening. This model was used to screen odorants from an in-house odorant data set consisting of 2020 odorants, which were later evaluated using in vitro experiments. Out of 11 odorants proposed using the final model, 8 odorants were found to exhibit antiviral activity. The feature interpretation of screened odorants suggested that they contained hydrophilic substructures, such as hydroxyl group, which might contribute to denaturation of proteins on the surface of the virus. These odorants should be explored as a preventive measure in closed spaces to decrease the risk of infections of IAV.

8.
Respir Investig ; 61(1): 27-39, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36207238

ABSTRACT

BACKGROUND: As a first step in identifying the developmental pathways of pulmonary abnormalities in rheumatoid arthritis (RA), we sought to determine the existing and changing patterns of pulmonary abnormalities. METHODS: We conducted a retrospective cohort study of consecutive patients with RA who underwent high-resolution computed tomography before and during biologic therapy. The presence of 20 pulmonary abnormalities and the changes in those abnormalities were recorded. Patterns of pre-existing and changing abnormalities were examined via cluster analysis, and their relationship was also assessed using the Kaplan-Meier method and log-rank test. RESULTS: A total of 208 subjects were included. Pulmonary abnormalities were observed in 70% of patients: 39% had interstitial lung disease, and 55% had airway disease (AD). Several different pulmonary abnormalities were commonly found to co-exist in several patterns in the same patient. In most patients with pulmonary abnormalities, AD was present alone or in combination with other abnormalities. During the observation period (mean 3.2 years), 172 pulmonary abnormalities had changed in 91 patients: 115 pulmonary abnormalities newly emerged, whereas 42 worsened and 25 demonstrated improvement. Pulmonary abnormalities changed in several patterns. Correlations were observed between pre-existing and new/worsening abnormalities at individual and regional levels, such as new ground-glass opacity (GGO) and pre-existing AD, small nodular patterns, and honeycombing. AD was a possible initial abnormality. CONCLUSIONS: Pulmonary abnormalities occurred and changed in several patterns, which suggests the existence of developmental pathways of pulmonary abnormalities. AD may play an important role in the development of these abnormalities, including GGO.


Subject(s)
Arthritis, Rheumatoid , Lung Diseases, Interstitial , Humans , Retrospective Studies , Lung Diseases, Interstitial/diagnostic imaging , Lung Diseases, Interstitial/epidemiology , Lung Diseases, Interstitial/etiology , Arthritis, Rheumatoid/complications , Arthritis, Rheumatoid/diagnostic imaging , Arthritis, Rheumatoid/epidemiology , Lung/diagnostic imaging , Tomography, X-Ray Computed/methods
9.
ACS Omega ; 7(30): 26952-26964, 2022 Aug 02.
Article in English | MEDLINE | ID: mdl-35936487

ABSTRACT

Predicting the outcomes of organic reactions using data-driven approaches aids in the acceleration of research. In laboratory-scale experiments, only a small number of reaction data can be accessed for machine learning model construction, where reaction representations play a pivotal role in the success of model construction. Nevertheless, representation comparison for a small data set is not adequate. Herein, focusing on the enantioselectivity of phosphoric-acid-catalyzed reactions, various two-dimensional and three-dimensional reaction representations (descriptors) were compared. Overall, the concatenated form of the extended connectivity fingerprints showed the best predictive capability for the two types of data sets: high-throughput experimental data and manually collected literature data sets. Furthermore, highlighting the substructure contribution to the prediction outcome was shown to be informative for guiding catalyst development.

10.
Mater Today Bio ; 15: 100332, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35795137

ABSTRACT

In vivo imaging of blood vessels is crucial for studying blood vessel-related diseases in real time. For this purpose, fluorescence-based imaging is one of the foremost techniques for imaging a living system. In our previous report, screening of a chemical probe library led to the discovery of a new near-infrared probe (CyA-B2), which showed the most specific binding to the blood capillaries of 3D-tissue models. This prompted us to study the binding site of the probe on the surface of endothelial cells, the main component cells of blood capillaries. In competition assays of CyA-B2 against several potential endothelial cell surface markers, which were found through a chemical database (ChEMBL) and manually selected, CD133 gave the lowest IC50 (half-maximal inhibitory concentration) value. Hence, the CD133 protein, which is expressed on the endothelial cell membrane, was postulated to be the binding site because CyA-B2 binding to the blood capillaries was suppressed in the competition assays. Since CD133 is also expressed on many types of cancer cells, CyA-B2 could be useful as a bioprobe for monitoring or diagnosing tumor growth.

11.
ACS Omega ; 7(16): 14057-14068, 2022 Apr 26.
Article in English | MEDLINE | ID: mdl-35559135

ABSTRACT

A topological pharmacophore (TP) is a chemical graph-based pharmacophore representation, where nodes are pharmacophoric features (PF) and edges are topological distances between PFs. Previously proposed sparse pharmacophore graphs (SPhGs) for TPs were shown to be effective in identifying structurally different active compounds while maintaining the interpretability of the graphs. However, one limitation of using SPhGs as queries is that many structurally similar SPhGs can be identified from a set of active compounds, requiring the classification and visualization of SPhGs, followed by an understanding of the pharmacophore hypotheses. In this study, we propose a scheme for SPhG analysis based on dimensionality reduction techniques with the graph edit distance (GED) metric. This metric enables measuring similarities among SPhGs in a quantitative manner. The visualization of SPhGs, which themselves are the graphs shared by active compounds, can help us understand the pharmacophore hypotheses as well as the data set. As a proof-of-concept study, we generated two-dimensional SPhG-maps using three dimensionality reduction techniques for six biological targets. A comparison with other pharmacophore representations was also conducted. We demonstrated knowledge extraction (interpretation of the data set) from the generated maps. Our findings include a suitable mapping algorithm as well as a pharmacophore hypothesis analysis procedure using an SPhG-map.

12.
J Comput Aided Mol Des ; 36(3): 237-252, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35348984

ABSTRACT

The retrospective evaluation of virtual screening approaches and activity prediction models is important for methodological development. However, for fair comparison, evaluation data sets must be carefully prepared. In this research, we compiled structure-activity-relationship matrix-based data sets for 15 biological targets along with many diverse inactive compounds, assuming the early stage of structure-activity-relationship progression. To use a large number of diverse inactive compounds and a limited number of active compounds, similarity profiles (SPs) are proposed as a set of molecular descriptors. Using these highly imbalanced data sets, we evaluated various approaches including SPs, under-sampling, support vector machine (SVM), and message passing neural networks. We found that for the under-sampling approaches, cluster-based sampling was better than random sampling. For virtual screening, SPs with inactive reference compounds and the under-sampling SVM also performed well. For classification, SPs with many inactive references performed as well as the under-sampling SVM trained on a balanced data set. Although the performance of SPs and the under-sampling SVM was comparable, SPs with many inactive references were preferable for selecting structurally distinct compounds from the active training compounds.
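
As a rough illustration of the similarity-profile idea, a compound can be described by its similarities to a fixed list of reference compounds. The sketch below assumes set-based fingerprints and Tanimoto similarity; the abstract does not state which similarity measure or fingerprint was used, and `similarity_profile` is a hypothetical name.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_profile(query, references):
    """Describe a compound by its similarity to each reference compound.

    The descriptor length equals the number of references, so adding more
    inactive references lengthens the descriptor rather than adding
    training instances.
    """
    return [tanimoto(query, ref) for ref in references]


references = [{1, 2, 3}, {3, 4}, {9}]
profile = similarity_profile({1, 2, 3}, references)
```

The resulting vector can be passed to any standard classifier in place of a conventional fingerprint.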


Subject(s)
Support Vector Machine , Ligands , Retrospective Studies , Structure-Activity Relationship
13.
Mol Inform ; 41(2): e2100156, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34585854

ABSTRACT

Chemical reaction yield is one of the most important factors for determining reaction conditions. Recently, several machine learning-based prediction models using high-throughput experiment (HTE) data sets were reported for the prediction of reaction yield. However, none of them were at a practical level in terms of predictive ability. In this study, we propose a message passing neural network (MPNN) model for chemical yield prediction, focusing on the Buchwald-Hartwig cross-coupling HTE data set. As an initial atom embedding in the MPNN model, we propose to use the Mol2Vec feature vectors pre-trained using a large compound database. The predictive ability of the proposed model was higher than that of the five previously reported models for three of the five data sets. Moreover, visualization of important atoms based on the self-attention mechanism favored Mol2Vec as an atom embedding over other embeddings, including previously employed simple representations.


Subject(s)
Deep Learning , Databases, Factual , Machine Learning , Neural Networks, Computer
14.
Molecules ; 26(16), 2021 Aug 13.
Article in English | MEDLINE | ID: mdl-34443503

ABSTRACT

Activity cliffs (ACs) are formed by two structurally similar compounds with a large difference in potency. Accurate AC prediction is expected to help researchers' decisions in the early stages of drug discovery. Previously, predictive models based on matched molecular pair (MMP) cliffs have been proposed. However, the proposed methods face a challenge of interpretability due to the black-box character of the predictive models. In this study, we developed interpretable MMP fingerprints and modified a model-specific interpretation approach for models based on a support vector machine (SVM) and MMP kernel. We compared important features highlighted by this SVM-based interpretation approach and the SHapley Additive exPlanations (SHAP) as a major model-independent approach. The model-specific approach could capture the difference between AC and non-AC, while SHAP assigned high weights to the features not present in the test instances. For specific MMPs, the feature weights mapped by the SVM-based interpretation method were in agreement with the previously confirmed binding knowledge from X-ray co-crystal structures, indicating that this method is able to interpret the AC prediction model in a chemically intuitive manner.

15.
J Chem Inf Model ; 61(7): 3348-3360, 2021 Jul 26.
Article in English | MEDLINE | ID: mdl-34264667

ABSTRACT

The aim of scaffold hopping (SH) is to find compounds consisting of different scaffolds from those in already known active compounds, giving an opportunity for unexplored regions of chemical space. We previously demonstrated the usefulness of pharmacophore graphs (PhGs) for this purpose through proof-of-concept virtual screening experiments. PhGs consist of nodes and edges corresponding to pharmacophoric features (PFs) and their topological distances. Although PhGs were effective in SH, they are hard to interpret as they are complete graphs. Herein, we introduce an intuitive representation of a molecule, termed as sparse pharmacophore graphs (SPhG) by keeping the topological distances among PFs as much as possible while reducing the number of edges in the graphs. Several benchmark calculations quantitatively confirmed the sparseness of the graphs and the preservation of topological distances among pharmacophoric points. As proof-of-concept applications, virtual screening (VS) trials for SH were conducted using active and inactive compounds from ChEMBL and PubChem databases for three biological targets: thrombin, tyrosine kinase ABL1, and κ-opioid receptor. The performances of VS were comparable with using fully connected PhGs. Furthermore, highly ranked SPhGs were interpretable for the three biological targets, in particular for thrombin, for which selected SPhGs were in agreement with the structure-based interpretation.


Subject(s)
Drug Design , Receptors, Drug
16.
ACS Omega ; 6(18): 11964-11973, 2021 May 11.
Article in English | MEDLINE | ID: mdl-34056351

ABSTRACT

In ligand-based drug design, quantitative structure-activity relationship (QSAR) models play an important role in activity prediction. One of the major end points of QSAR models is half-maximal inhibitory concentration (IC50). Experimental IC50 data from various research groups have been accumulated in publicly accessible databases, providing an opportunity for us to use such data in predictive QSAR models. In this study, we focused on using a ranking-oriented QSAR model as a predictive model because relative potency strength within the same assay is solid information that is not based on any mechanical assumptions. We conducted rigorous validation using the ChEMBL database and previously reported data sets. Ranking support vector machine (ranking-SVM) models trained on compounds from similar assays were as good as support vector regression (SVR) with the Tanimoto kernel trained on compounds from all the assays. As effective ways of data integration, for ranking-SVM, integrated compounds should be selected from only similar assays in terms of compounds. For SVR with the Tanimoto kernel, entire compounds from different assays can be incorporated.
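
The reliance on relative potency within the same assay can be sketched as the construction of ordered training pairs. This is a common way to prepare pairwise preference data for a ranking model, offered here as an assumption rather than as the procedure used in the study. The record layout and the function name `within_assay_pairs` are hypothetical.

```python
from itertools import combinations


def within_assay_pairs(records):
    """Build (more_potent, less_potent) compound pairs within each assay.

    records is a list of (assay_id, compound_id, pIC50) tuples. Values from
    different assays are never compared, so no assumption about
    cross-assay comparability is needed.
    """
    by_assay = {}
    for assay, compound, potency in records:
        by_assay.setdefault(assay, []).append((compound, potency))

    pairs = []
    for assay in sorted(by_assay):
        for (c1, p1), (c2, p2) in combinations(by_assay[assay], 2):
            if p1 == p2:
                continue  # ties carry no ranking information
            pairs.append((c1, c2) if p1 > p2 else (c2, c1))
    return pairs


records = [
    ("A1", "x", 7.0),
    ("A1", "y", 6.0),
    ("A2", "z", 8.0),
    ("A2", "x", 5.0),
    ("A2", "w", 8.0),
]
pairs = within_assay_pairs(records)
```

Compound "x" is measured in both assays, yet its two values (7.0 and 5.0) are never compared with each other.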

17.
J Comput Aided Mol Des ; 35(2): 179-193, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33392949

ABSTRACT

Quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models predict biological activity and molecular property based on the numerical relationship between chemical structures and activity (property) values. Molecular representations are of importance in QSAR/QSPR analysis. Topological information of molecular structures is usually utilized (2D representations) for this purpose. However, conformational information seems important because molecules are in the three-dimensional space. As a three-dimensional molecular representation applicable to diverse compounds, similarity between a test molecule and a set of reference molecules has been previously proposed. This 3D representation was found to be effective on virtual screening for early enrichment of active compounds. In this study, we introduced the 3D representation into QSAR/QSPR modeling (regression tasks). Furthermore, we investigated relative merits of 3D representations over 2D in terms of the diversity of training data sets. For the prediction task of quantum mechanics-based properties, the 3D representations were superior to 2D. For predicting activity of small molecules against specific biological targets, no consistent trend was observed in the difference of performance using the two types of representations, irrespective of the diversity of training data sets.


Subject(s)
Organic Chemicals/chemistry , Databases, Factual , Drug Evaluation, Preclinical , Machine Learning , Models, Molecular , Molecular Conformation , Quantitative Structure-Activity Relationship , Regression Analysis
18.
Mol Inform ; 39(12): e2000103, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32830451

ABSTRACT

Activity cliffs (ACs) are formed by pairs of structurally similar compounds with large differences in potency. Predicting ACs is of high interest in lead optimization for drug discovery. Previous AC prediction models that focused on matched molecular pair (MMP) cliffs produced adequate performances. However, the extrapolation ability of these models is unclear because the main scaffold for MMPs, the core structure, could exist in both training and test data sets. Also, representation of MMPs did not consider the attachment points where the core and R-group substituents are connected. In this study, we aimed to improve a ligand-based AC prediction method using molecular fingerprints. We incorporated applicability domain, which was defined using R-path fingerprints to consider the local environment around an attachment point. Rigorous evaluation of the extrapolation ability of AC prediction models showed that MMP-cliffs were accurately predicted for nine biological targets. Furthermore, incorporation of training MMPs with cores distinct from those of test MMPs improved the predictability compared with using training MMPs with only similar cores.


Subject(s)
Models, Chemical , Databases, Chemical , Ligands
19.
J Chem Inf Model ; 60(4): 2073-2081, 2020 Apr 27.
Article in English | MEDLINE | ID: mdl-32202780

ABSTRACT

The primary goal of ligand-based virtual screening is to identify active compounds consisting of a core scaffold that is not found in the current active compound pool. Scaffold hopping is the term used for this purpose. In the present study, topological representations of pharmacophore features on chemical graphs were investigated for scaffold hopping. Pharmacophore graphs (PhGs), which consist of pharmacophore features as nodes and their topological distances as edges, were used as a representation of the information important for compounds being active. We investigated ranking methods for prioritizing PhGs for scaffold hopping. The proposed method, NScaffold, which ranks PhGs based on the number of scaffolds covered by the PhGs, outperformed other conventional methods. As a demonstrative case, using a thrombin inhibitor data set, we interpreted the highest-ranked PhGs by NScaffold from the protein-ligand interaction point of view. The NScaffold method successfully retrieved three known important interactions, showing the potential for identifying scaffold-hopped compounds with interpretable PhGs.
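
The ranking criterion described above, the number of scaffolds covered by each PhG, can be sketched in a few lines. The input layout (a mapping from each PhG to the active compounds that contain it, plus a mapping from each compound to its scaffold) and the function name are assumptions made for illustration; how PhGs are enumerated and matched is outside the scope of this sketch.

```python
def rank_by_scaffold_coverage(phg_matches, scaffold_of):
    """Rank pharmacophore graphs by the number of distinct scaffolds covered.

    phg_matches maps a PhG identifier to the active compounds containing it.
    scaffold_of maps a compound identifier to its scaffold identifier.
    Ties are broken by PhG identifier so the order is deterministic.
    """
    counts = {
        phg: len({scaffold_of[c] for c in compounds})
        for phg, compounds in phg_matches.items()
    }
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


phg_matches = {"g1": ["c1", "c2", "c3"], "g2": ["c1", "c4"], "g3": ["c2"]}
scaffold_of = {"c1": "s1", "c2": "s1", "c3": "s2", "c4": "s3"}
ranking = rank_by_scaffold_coverage(phg_matches, scaffold_of)
```

Counting scaffolds rather than compounds means that "g1", which matches three compounds, scores no higher than "g2", which matches two, because both cover two scaffolds.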


Subject(s)
Receptors, Drug , Ligands
20.
ACS Omega ; 4(12): 15304-15311, 2019 Sep 17.
Article in English | MEDLINE | ID: mdl-31552377

ABSTRACT

Similarity searching (SS) is a core approach in computational compound screening and has a long tradition in pharmaceutical research. Over the years, different approaches have been introduced to increase the information content of search calculations and optimize the ability to detect compounds having similar activity. We present a large-scale comparison of distinct search strategies on more than 600 qualifying compound activity classes. Challenging test cases for SS were identified and used to evaluate different ways to further improve search performance, which provided a differentiated view of alternative search strategies and their relative performance. It was found that search results could not only be improved by increasing compound input information but also by focusing similarity calculations on database compounds. In the presence of multiple active reference compounds, asymmetric SS with high weights on chemical features of target compounds emerged as an overall preferred approach across many different activity classes. These findings have implications for practical virtual screening applications.
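
One widely used asymmetric similarity measure is the Tversky index, which weights the features unique to each compound separately. The abstract does not name the measure used, so the sketch below is an assumption. The weights, the set-based fingerprints, and the reading of "target compounds" as database compounds are likewise illustrative.

```python
def tversky(reference, database, alpha, beta):
    """Tversky index between two sets of fingerprint bit indices.

    alpha weights features found only in the reference compound and beta
    weights features found only in the database compound. With
    alpha = beta = 1 the index reduces to Tanimoto similarity.
    """
    common = len(reference & database)
    denom = (
        common
        + alpha * len(reference - database)
        + beta * len(database - reference)
    )
    return common / denom if denom else 0.0


reference = {1, 2, 3, 4}
database = {1, 2}  # a substructure-like subset of the reference features

symmetric = tversky(reference, database, 1.0, 1.0)
database_weighted = tversky(reference, database, 0.1, 0.9)
reference_weighted = tversky(reference, database, 0.9, 0.1)
```

In this example the database compound has no features outside the reference, so a high weight on database-only features leaves it almost unpenalized, whereas a high weight on reference-only features lowers its score.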
