Search | VHL Regional Portal

Multi-Task ADME/PK prediction at industrial scale: leveraging large and diverse experimental datasets.

Walter, Moritz; Borghardt, Jens M; Humbeck, Lina; Skalic, Miha.

Mol Inform ; : e202400079, 2024 Jul 08.

Article in English | MEDLINE | ID: mdl-38973777

ABSTRACT

ADME (Absorption, Distribution, Metabolism, Excretion) properties are key parameters to judge whether a drug candidate exhibits a desired pharmacokinetic (PK) profile. In this study, we tested multi-task machine learning (ML) models to predict ADME and animal PK endpoints trained on in-house data generated at Boehringer Ingelheim. Models were evaluated both at the design stage of a compound (i.e., no experimental data of test compounds available) and at the testing stage when a particular assay would be conducted (i.e., experimental data of earlier conducted assays may be available). Using realistic time-splits, we found a clear benefit in performance of multi-task graph-based neural network models over single-task models, which was even stronger when experimental data of earlier assays is available. In an attempt to explain the success of multi-task models, we found that especially endpoints with the largest numbers of data points (physicochemical endpoints, clearance in microsomes) are responsible for increased predictivity in more complex ADME and PK endpoints. In summary, our study provides insight into how data for multiple ADME/PK endpoints in a pharmaceutical company can be best leveraged to optimize predictivity of ML models.

Benchmarking Molecular Feature Attribution Methods with Activity Cliffs.

Jiménez-Luna, José; Skalic, Miha; Weskamp, Nils.

J Chem Inf Model ; 62(2): 274-283, 2022 01 24.

Article in English | MEDLINE | ID: mdl-35019265

ABSTRACT

Feature attribution techniques are popular choices within the explainable artificial intelligence toolbox, as they can help elucidate which parts of the provided inputs used by an underlying supervised-learning method are considered relevant for a specific prediction. In the context of molecular design, these approaches typically involve the coloring of molecular graphs, whose presentation to medicinal chemists can be useful for making a decision of which compounds to synthesize or prioritize. The consistency of the highlighted moieties alongside expert background knowledge is expected to contribute to the understanding of machine-learning models in drug design. Quantitative evaluation of such coloring approaches, however, has so far been limited to substructure identification tasks. We here present an approach that is based on maximum common substructure algorithms applied to experimentally-determined activity cliffs. Using the proposed benchmark, we found that molecule coloring approaches in conjunction with classical machine-learning models tend to outperform more modern, graph-neural-network alternatives. The provided benchmark data are fully open sourced, which we hope will facilitate the testing of newly developed molecular feature attribution techniques.

Subject(s)

Artificial Intelligence , Benchmarking , Algorithms , Machine Learning , Neural Networks, Computer

Coloring Molecules with Explainable Artificial Intelligence for Preclinical Relevance Assessment.

Jiménez-Luna, José; Skalic, Miha; Weskamp, Nils; Schneider, Gisbert.

J Chem Inf Model ; 61(3): 1083-1094, 2021 03 22.

Article in English | MEDLINE | ID: mdl-33629843

ABSTRACT

Graph neural networks are able to solve certain drug discovery tasks such as molecular property prediction and de novo molecule generation. However, these models are considered "black-box" and "hard-to-debug". This study aimed to improve modeling transparency for rational molecular design by applying the integrated gradients explainable artificial intelligence (XAI) approach for graph neural network models. Models were trained for predicting plasma protein binding, hERG channel inhibition, passive permeability, and cytochrome P450 inhibition. The proposed methodology highlighted molecular features and structural elements that are in agreement with known pharmacophore motifs, correctly identified property cliffs, and provided insights into unspecific ligand-target interactions. The developed XAI approach is fully open-sourced and can be used by practitioners to train new models on other clinically relevant endpoints.

Subject(s)

Artificial Intelligence , Neural Networks, Computer , Drug Discovery , Ligands

Transcriptomics unravels the adaptive molecular mechanisms of Brettanomyces bruxellensis under SO₂ stress in wine condition.

Valdetara, Federica; Skalic, Miha; Fracassetti, Daniela; Louw, Marli; Compagno, Concetta; du Toit, Maret; Foschino, Roberto; Petrovic, Uros; Divol, Benoit; Vigentini, Ileana.

Food Microbiol ; 90: 103483, 2020 Sep.

Article in English | MEDLINE | ID: mdl-32336374

ABSTRACT

Sulfur dioxide is generally used as an antimicrobial in wine to counteract the activity of spoilage yeasts, including Brettanomyces bruxellensis. However, this chemical does not exert the same effectiveness on different B. bruxellensis yeasts since some strains can proliferate in the final product, leading to a negative sensory profile due to 4-ethylguaiacol and 4-ethylphenol. Thus, the capability of deciphering the general molecular mechanisms characterizing this yeast species' response in the presence of SO₂ stress could be considered strategic for a better management of SO₂ in winemaking. An RNA-Seq approach was used to investigate the gene expression of two strains of B. bruxellensis, AWRI 1499 and CBS 2499, having different genetic backgrounds, when exposed to a SO₂ pulse. Results revealed that sulphites affected yeast culturability and metabolism, but not volatile phenol production, suggesting that a phenotypical heterogeneity could be involved in the cell adaptation to SO₂. The transcriptomics variation in response to SO₂ stress confirmed the strain-related response in B. bruxellensis, and the GO analysis of common differentially expressed genes showed that the detoxification process carried out by the SSU1 gene can be considered as the principal specific adaptive response to counteract the SO₂ presence. However, nonspecific mechanisms can be exploited by cells to assist the SO₂ tolerance; namely, the metabolisms related to sugar alcohol (polyols) and oxidative stress, and structural compounds.

Subject(s)

Brettanomyces/genetics , Brettanomyces/metabolism , Fermentation , Stress, Physiological , Sulfur Dioxide/metabolism , Wine/microbiology , Food Microbiology , Gene Expression Profiling , RNA-Seq , Transcriptome

From Target to Drug: Generative Modeling for the Multimodal Structure-Based Ligand Design.

Skalic, Miha; Sabbadin, Davide; Sattarov, Boris; Sciabola, Simone; De Fabritiis, Gianni.

Mol Pharm ; 16(10): 4282-4291, 2019 10 07.

Article in English | MEDLINE | ID: mdl-31437001

ABSTRACT

Chemical space is impractically large, and conventional structure-based virtual screening techniques cannot be used to simply search through the entire space to discover effective bioactive molecules. To address this shortcoming, we propose a generative adversarial network to generate, rather than search, diverse three-dimensional ligand shapes complementary to the pocket. Furthermore, we show that the generated molecule shapes can be decoded using a shape-captioning network into a sequence of SMILES, directly enabling structure-based de novo drug design. We evaluate the quality of the method by both structure- (docking) and ligand-based [quantitative structure-activity relationship (QSAR)] virtual screening methods. For both evaluation approaches, we observed enrichment compared to random sampling from initial chemical space of ZINC drug-like compounds.

Subject(s)

Drug Design , Drug Discovery , Models, Chemical , Neural Networks, Computer , Proteins/chemistry , Small Molecule Libraries/chemistry , Humans , Ligands , Molecular Conformation , Proteins/metabolism , Quantitative Structure-Activity Relationship , Small Molecule Libraries/metabolism

Shape-Based Generative Modeling for de Novo Drug Design.

Skalic, Miha; Jiménez, José; Sabbadin, Davide; De Fabritiis, Gianni.

J Chem Inf Model ; 59(3): 1205-1214, 2019 03 25.

Article in English | MEDLINE | ID: mdl-30762364

ABSTRACT

In this work, we propose a machine learning approach to generate novel molecules starting from a seed compound, its three-dimensional (3D) shape, and its pharmacophoric features. The pipeline draws inspiration from generative models used in image analysis and represents a first example of the de novo design of lead-like molecules guided by shape-based features. A variational autoencoder is used to perturb the 3D representation of a compound, followed by a system of convolutional and recurrent neural networks that generate a sequence of SMILES tokens. The generative design of novel scaffolds and functional groups can cover unexplored regions of chemical space that still possess lead-like properties.

Subject(s)

Machine Learning , Pharmaceutical Preparations/chemistry , Drug Design , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Molecular Conformation , Molecular Structure , Quantitative Structure-Activity Relationship

LigVoxel: inpainting binding pockets using 3D-convolutional neural networks.

Skalic, Miha; Varela-Rial, Alejandro; Jiménez, José; Martínez-Rosell, Gerard; De Fabritiis, Gianni.

Bioinformatics ; 35(2): 243-250, 2019 01 15.

Article in English | MEDLINE | ID: mdl-29982392

ABSTRACT

Motivation: Structure-based drug discovery methods exploit protein structural information to design small molecules binding to given protein pockets. This work proposes a purely data-driven, structure-based approach for imaging ligands as spatial fields in target protein pockets. We use an end-to-end deep learning framework trained on experimental protein-ligand complexes with the intention of mimicking a chemist's intuition at manually placing atoms when designing a new compound. We show that these models can generate spatial images of ligand chemical properties like occupancy, aromaticity and donor-acceptor matching the protein pocket. Results: The predicted fields considerably overlap with those of unseen ligands bound to the target pocket. Maximization of the overlap between the predicted fields and a given ligand on the Astex diverse set recovers the original ligand crystal poses in 70 out of 85 cases within a threshold of 2 Å RMSD. We expect that these models can be used for guiding structure-based drug discovery approaches. Availability and implementation: LigVoxel is available as part of the PlayMolecule.org molecular web application suite. Supplementary information: Supplementary data are available at Bioinformatics online.

Subject(s)

Drug Discovery , Neural Networks, Computer , Proteins/chemistry , Software , Binding Sites , Computational Biology , Ligands , Protein Binding , Protein Conformation

PlayMolecule BindScope: large scale CNN-based virtual screening on the web.

Skalic, Miha; Martínez-Rosell, Gerard; Jiménez, José; De Fabritiis, Gianni.

Bioinformatics ; 35(7): 1237-1238, 2019 04 01.

Article in English | MEDLINE | ID: mdl-30169549

ABSTRACT

SUMMARY: Virtual screening pipelines are among the most popular tools in structure-based drug discovery, since they can reduce both time and cost associated with experimental assays. Recent advances in deep learning methodologies have shown that these outperform classical scoring functions at discriminating binder protein-ligand complexes. Here, we present BindScope, a web application for large-scale active-inactive classification of compounds based on deep convolutional neural networks. Performance is on par with current state-of-the-art pipelines. Users can screen on the order of hundreds of compounds at once and interactively visualize the results. AVAILABILITY AND IMPLEMENTATION: BindScope is available as part of the PlayMolecule.org web application suite. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Drug Discovery , Internet , Deep Learning , Drug Discovery/methods , Ligands , Neural Networks, Computer

SUPPA2: fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions.

Trincado, Juan L; Entizne, Juan C; Hysenaj, Gerald; Singh, Babita; Skalic, Miha; Elliott, David J; Eyras, Eduardo.

Genome Biol ; 19(1): 40, 2018 03 23.

Article in English | MEDLINE | ID: mdl-29571299

ABSTRACT

Despite the many approaches to study differential splicing from RNA-seq, many challenges remain unsolved, including computing capacity and sequencing depth requirements. Here we present SUPPA2, a new method that addresses these challenges, and enables streamlined analysis across multiple conditions taking into account biological variability. Using experimental and simulated data, we show that SUPPA2 achieves higher accuracy compared to other methods, especially at low sequencing depth and short read length. We use SUPPA2 to identify novel Transformer2-regulated exons, novel microexons induced during differentiation of bipolar neurons, and novel intron retention events during erythroblast differentiation.

Subject(s)

Alternative Splicing , Sequence Analysis, RNA , Cell Line , Erythroblasts/metabolism , Exons , Humans , Neurons/metabolism , Software

K_DEEP: Protein-Ligand Absolute Binding Affinity Prediction via 3D-Convolutional Neural Networks.

Jiménez, José; Skalic, Miha; Martínez-Rosell, Gerard; De Fabritiis, Gianni.

J Chem Inf Model ; 58(2): 287-296, 2018 02 26.

Article in English | MEDLINE | ID: mdl-29309725

ABSTRACT

Accurately predicting protein-ligand binding affinities is an important problem in computational chemistry since it can substantially accelerate drug discovery for virtual screening and lead optimization. We propose here a fast machine-learning approach for predicting binding affinities using state-of-the-art 3D-convolutional neural networks and compare this approach to other machine-learning and scoring methods using several diverse data sets. The results for the standard PDBbind (v.2016) core test-set are state-of-the-art with a Pearson's correlation coefficient of 0.82 and an RMSE of 1.27 in pK units between experimental and predicted affinity, but accuracy is still very sensitive to the specific protein used. K_DEEP is made available via PlayMolecule.org for users to easily test their own protein-ligand complexes, with each prediction taking a fraction of a second. We believe that the speed, performance, and ease of use of K_DEEP make it already an attractive scoring function for modern computational chemistry pipelines.

Subject(s)

Computational Biology/methods , Deep Learning , Proteins/chemistry , Databases, Protein , Drug Discovery , Ligands , Models, Chemical , Protein Binding , Structure-Activity Relationship
