Results 1 - 14 of 14
1.
NPJ Precis Oncol ; 8(1): 95, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38658785

ABSTRACT

Machine learning (ML) models of drug sensitivity prediction are becoming increasingly popular in precision oncology. Here, we identify a fundamental limitation in standard measures of drug sensitivity that hinders the development of personalized prediction models - they focus on absolute effects but do not capture relative differences between cancer subtypes. Our work suggests that using z-scored drug response measures mitigates these limitations and leads to meaningful predictions, opening the door for sophisticated ML precision oncology models.
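As a rough illustration of the z-scoring idea in this abstract, the sketch below standardizes each drug's responses across cell lines, so that a value expresses how sensitive a cell line is relative to the others treated with the same drug. The abstract does not say along which axis the z-score is taken; per-drug standardization is an assumption here, and the function name, drug names, and values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_per_drug(responses):
    """Standardize raw responses per drug across cell lines.

    responses: dict mapping drug -> dict mapping cell line -> raw value
    (for example a log IC50). Assumption: z-scoring is done per drug.
    """
    zscored = {}
    for drug, by_line in responses.items():
        values = list(by_line.values())
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation
        if sigma == 0:
            # All cell lines respond identically: no relative signal.
            zscored[drug] = {line: 0.0 for line in by_line}
        else:
            zscored[drug] = {
                line: (value - mu) / sigma for line, value in by_line.items()
            }
    return zscored


# Hypothetical toy data: drug_B is less potent overall (higher raw values),
# but the relative ranking of the cell lines matches drug_A.
raw = {
    "drug_A": {"line_1": -2.0, "line_2": -1.0, "line_3": 0.0},
    "drug_B": {"line_1": 3.0, "line_2": 4.0, "line_3": 5.0},
}
z = zscore_per_drug(raw)
```

In this toy case both drugs end up with the same z-scores, because the absolute potency offset is removed and only the relative differences between cell lines remain.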

3.
Nat Commun ; 14(1): 3686, 2023 Jun 21.
Article in English | MEDLINE | ID: mdl-37344485

ABSTRACT

Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is a critical aspect for enabling ML integration in research workflows, yet many data models impose significant rigidity making it difficult to accommodate a broad array of experiment and data types found in polymer science. This inflexibility presents a significant barrier for researchers to leverage their historical data in ML development. Here we show that a domain-specific language, termed Chemical Markdown Language (CMDL), provides flexible, extensible, and consistent representation of disparate experiment types and polymer structures. CMDL enables seamless use of historical experimental data to fine-tune regression transformer (RT) models for generative molecular design tasks. We demonstrate the utility of this approach through the generation and the experimental validation of catalysts and polymers in the context of ring-opening polymerization, although we provide examples of how CMDL can be more broadly applied to other polymer classes. Critically, we show how the CMDL-tuned model preserves key functional groups within the polymer structure, allowing for experimental validation. These results reveal the versatility of CMDL and how it facilitates translation of historical data into meaningful predictive and generative models to produce experimentally actionable output.

4.
J Chem Inf Model ; 62(18): 4295-4299, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36098536

ABSTRACT

Recent work showed that active site rather than full-protein-sequence information improves predictive performance in kinase-ligand binding affinity prediction. To refine the notion of an "active site", we here propose and compare multiple definitions. We report significant evidence that our novel definition is superior to previous definitions and better models of ATP-noncompetitive inhibitors. Moreover, we leverage the discontiguity of the active site sequence to motivate novel protein-sequence augmentation strategies and find that combining them further improves performance.


Subject(s)
Adenosine Triphosphate , Adenosine Triphosphate/metabolism , Amino Acid Sequence , Binding Sites , Ligands , Protein Binding
5.
J Chem Inf Model ; 62(2): 240-257, 2022 01 24.
Article in English | MEDLINE | ID: mdl-34905358

ABSTRACT

Recent advances in deep learning have enabled the development of large-scale multimodal models for virtual screening and de novo molecular design. The human kinome with its abundant sequence and inhibitor data presents an attractive opportunity to develop proteochemometric models that exploit the size and internal diversity of this family of targets. Here, we challenge a standard practice in sequence-based affinity prediction models: instead of leveraging the full primary structure of proteins, each target is represented by a sequence of 29 discontiguous residues defining the ATP binding site. In kinase-ligand binding affinity prediction, our results show that the reduced active site sequence representation is not only computationally more efficient but consistently yields significantly higher performance than the full primary structure. This trend persists across different models, data sets, and performance metrics and holds true when predicting pIC50 for both unseen ligands and kinases. Our interpretability analysis reveals a potential explanation for the superiority of the active site models: whereas only mild statistical effects about the extraction of three-dimensional (3D) interaction sites take place in the full sequence models, the active site models are equipped with an implicit but strong inductive bias about the 3D structure stemming from the discontiguity of the active sites. Moreover, in direct comparisons, our models perform similarly or better than previous state-of-the-art approaches in affinity prediction. We then investigate a de novo molecular design task and find that the active site provides benefits in the computational efficiency, but otherwise, both kinase representations yield similar optimized affinities (for both SMILES- and SELFIES-based molecular generators). Our work challenges the assumption that the full primary structure is indispensable for modeling human kinases.
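The reduced representation described in this abstract can be pictured as picking a fixed set of residue positions out of the full sequence and concatenating them. The sketch below is only an illustration: the function name, the toy sequence, and the positions are hypothetical, and the actual 29 residue positions of the ATP binding site are not given in the abstract.

```python
def active_site_sequence(full_sequence, positions):
    """Concatenate the residues found at the given 1-based positions.

    The positions may be discontiguous; they are emitted in sequence order.
    """
    for position in positions:
        if not 1 <= position <= len(full_sequence):
            raise ValueError(f"position {position} is outside the sequence")
    return "".join(full_sequence[p - 1] for p in sorted(positions))


# Hypothetical toy protein and hypothetical binding-site positions.
toy_sequence = "MKVLAAGIVGLSEHQ"
toy_positions = [2, 3, 7, 8, 12]
site = active_site_sequence(toy_sequence, toy_positions)
```

Residues that are far apart in the primary structure become neighbours in the shortened string, which is one way to read the abstract's remark about an implicit inductive bias stemming from discontiguity.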


Subject(s)
Proteins , Binding Sites , Catalytic Domain , Humans , Ligands , Protein Binding , Proteins/metabolism
7.
Bioinformatics ; 37(Suppl_1): i237-i244, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252922

ABSTRACT

MOTIVATION: The activity of the adaptive immune system is governed by T-cells and their specific T-cell receptors (TCR), which selectively recognize foreign antigens. Recent advances in experimental techniques have enabled sequencing of TCRs and their antigenic targets (epitopes), making it possible to investigate the missing link between TCR sequence and epitope binding specificity. Scarcity of data and a large sequence space make this task challenging, and to date only models limited to a small set of epitopes have achieved good performance. Here, we establish a k-nearest-neighbor (K-NN) classifier as a strong baseline and then propose Tcr epITope bimodal Attention Networks (TITAN), a bimodal neural network that explicitly encodes both TCR sequences and epitopes to enable the independent study of generalization capabilities to unseen TCRs and/or epitopes. RESULTS: By encoding epitopes at the atomic level with SMILES sequences, we leverage transfer learning and data augmentation to enrich the input data space and boost performance. TITAN achieves high performance in the prediction of specificity of unseen TCRs (ROC-AUC 0.87 in 10-fold CV) and surpasses the results of the current state-of-the-art (ImRex) by a large margin. Notably, our Levenshtein-based K-NN classifier also exhibits competitive performance on unseen TCRs. While the generalization to unseen epitopes remains challenging, we report two major breakthroughs. First, by dissecting the attention heatmaps, we demonstrate that the sparsity of available epitope data favors an implicit treatment of epitopes as classes. This may be a general problem that limits unseen epitope performance for sufficiently complex models. Second, we show that TITAN nevertheless exhibits significantly improved performance on unseen epitopes and is capable of focusing attention on chemically meaningful molecular structures.
AVAILABILITY AND IMPLEMENTATION: The code as well as the dataset used in this study is publicly available at https://github.com/PaccMann/TITAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
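To make the Levenshtein-based K-NN baseline mentioned in this abstract more concrete, here is a simplified sketch that classifies a query sequence by majority vote among its nearest training sequences under edit distance. It is not the authors' implementation (that lives in the linked repository); the choice of k, the single-sequence setup, and the toy sequences and labels are all assumptions made for illustration.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def knn_predict(query, train, k=3):
    """Majority label among the k training sequences closest to the query.

    train: list of (sequence, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: levenshtein(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy sequences with binary binding labels (1 = binds).
toy_train = [
    ("CASSLGQ", 1),
    ("CASSLGA", 1),
    ("CASSPGQ", 1),
    ("CAWTRDN", 0),
    ("CAWTRDE", 0),
]
prediction = knn_predict("CASSLGE", toy_train, k=3)
```

The query differs from the three positive examples by at most two edits and from the negative ones by at least four, so the vote goes to the positive class.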


Subject(s)
Receptors, Antigen, T-Cell , T-Lymphocytes , Epitopes , Humans , Neural Networks, Computer , Receptors, Antigen, T-Cell/genetics , T-Cell Antigen Receptor Specificity
8.
Curr Med Chem ; 28(38): 7862-7886, 2021.
Article in English | MEDLINE | ID: mdl-34325627

ABSTRACT

It is more pressing than ever to reduce the time and costs for the development of lead compounds in the pharmaceutical industry. The co-occurrence of advances in high-throughput screening and the rise of deep learning (DL) have enabled the development of large-scale multimodal predictive models for virtual drug screening. Recently, deep generative models have emerged as a powerful tool to explore the chemical space and raise hopes to expedite the drug discovery process. Following this progress in chemocentric approaches for generative chemistry, the next challenge is to build multimodal conditional generative models that leverage disparate knowledge sources when mapping biochemical properties to target structures. Here, we call on the community to bridge drug discovery more closely with systems biology when designing deep generative models. Complementing the plethora of reviews on the role of DL in chemoinformatics, we specifically focus on the interface of predictive and generative modelling for drug discovery. Through a systematic publication keyword search on PubMed and a selection of preprint servers (arXiv, bioRxiv, ChemRxiv, and medRxiv), we quantify trends in the field and find that molecular graphs and VAEs have become the most widely adopted molecular representations and architectures in generative models, respectively. We discuss progress on DL for toxicity, drug-target affinity, and drug sensitivity prediction and specifically focus on conditional molecular generative models that encompass multimodal prediction models. Moreover, we outline future prospects in the field and identify challenges such as the integration of deep learning systems into experimental workflows in a closed-loop manner or the adoption of federated machine learning techniques to overcome data sharing barriers. Other challenges include, but are not limited to, interpretability in generative models, more sophisticated metrics for the evaluation of molecular generative models, and, following up on that, community-accepted benchmarks for both multimodal drug property prediction and property-driven molecular design.


Subject(s)
Deep Learning , Drug Design , Drug Discovery , Humans , Machine Learning , Models, Molecular
9.
Patterns (N Y) ; 2(6): 100269, 2021 Jun 11.
Article in English | MEDLINE | ID: mdl-33969323

ABSTRACT

Although a plethora of research articles on AI methods for COVID-19 medical imaging have been published, their clinical value remains unclear. We conducted the largest systematic review of the literature addressing the utility of AI in imaging for COVID-19 patient care. By keyword searches on PubMed and preprint servers throughout 2020, we identified 463 manuscripts and performed a systematic meta-analysis to assess their technical merit and clinical relevance. Our analysis evidences a significant disparity between clinical and AI communities, in the focus on both imaging modalities (AI experts neglected CT and ultrasound, favoring X-ray) and performed tasks (71.9% of AI papers centered on diagnosis). The vast majority of manuscripts were found to be deficient regarding potential use in clinical practice, but 2.7% (n = 12) of the publications were assigned a high maturity level and are summarized in greater detail. We provide an itemized discussion of the challenges in developing clinically relevant AI solutions with recommendations and remedies.

10.
iScience ; 24(4): 102269, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33851095

ABSTRACT

With the advent of deep generative models in computational chemistry, in-silico drug design is undergoing an unprecedented transformation. Although deep learning approaches have shown potential in generating compounds with desired chemical properties, they disregard the cellular environment of target diseases. Bridging systems biology and drug design, we present a reinforcement learning method for de novo molecular design from gene expression profiles. We construct a hybrid Variational Autoencoder that tailors molecules to target-specific transcriptomic profiles, using an anticancer drug sensitivity prediction model (PaccMann) as reward function. Without incorporating information about anticancer drugs, the molecule generation is biased toward compounds with high predicted efficacy against cell lines or cancer types. The generation can be further refined by subsidiary constraints such as toxicity. Our cancer-type-specific candidate drugs are similar to cancer drugs in drug-likeness, synthesizability, and solubility and frequently exhibit the highest structural similarity to compounds with known efficacy against these cancer types.

11.
Nucleic Acids Res ; 48(W1): W502-W508, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32402082

ABSTRACT

The identification of new targeted and personalized therapies for cancer requires the fast and accurate assessment of the drug efficacy of potential compounds against a particular biomolecular sample. It has been suggested that the integration of complementary sources of information might strengthen the accuracy of a drug efficacy prediction model. Here, we present a web-based platform for the Prediction of AntiCancer Compound sensitivity with Multimodal Attention-based Neural Networks (PaccMann). PaccMann is trained on public transcriptomic cell line profiles, compound structure information and drug sensitivity screenings, and outperforms state-of-the-art methods on anticancer drug sensitivity prediction. On the open-access web service (https://ibm.biz/paccmann-aas), users can select a known drug compound or design their own compound structure in an interactive editor, perform in-silico drug testing and investigate compound efficacy on publicly available or user-provided transcriptomic profiles. PaccMann leverages methods for model interpretability and outputs confidence scores as well as attention heatmaps that highlight the genes and chemical sub-structures that were more important to make a prediction, hence facilitating the understanding of the model's decision making and the involved biochemical processes. We hope to serve the community with a toolbox for fast and efficient validation in drug repositioning or lead compound identification regimes.


Subject(s)
Antineoplastic Agents/pharmacology , Drug Repositioning , Software , Antineoplastic Agents/chemistry , Computer Simulation , Gene Expression Profiling , Internet , Neural Networks, Computer , Sirolimus/analogs & derivatives , Sirolimus/pharmacology
12.
IEEE Access ; 8: 179437-179456, 2020.
Article in English | MEDLINE | ID: mdl-34812357

ABSTRACT

The COVID-19 pandemic has triggered an urgent call to contribute to the fight against an immense threat to the human population. Computer vision, as a subfield of artificial intelligence, has enjoyed recent success in solving various complex problems in health care and has the potential to contribute to the fight to control COVID-19. In response to this call, computer vision researchers are putting their knowledge base to the test to devise effective ways to counter the COVID-19 challenge and serve the global community. New contributions are being shared with every passing day. This motivated us to review the recent work, collect information about available research resources, and indicate future research directions. We want to make it possible for computer vision researchers to find existing and future research directions. This survey article presents a preliminary review of the literature on research community efforts against the COVID-19 pandemic.

13.
Mol Pharm ; 16(12): 4797-4806, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31618586

ABSTRACT

In line with recent advances in neural drug design and sensitivity prediction, we propose a novel architecture for interpretable prediction of anticancer compound sensitivity using a multimodal attention-based convolutional encoder. Our model is based on the three key pillars of drug sensitivity: compounds' structure in the form of a SMILES sequence, gene expression profiles of tumors, and prior knowledge on intracellular interactions from protein-protein interaction networks. We demonstrate that our multiscale convolutional attention-based encoder significantly outperforms a baseline model trained on Morgan fingerprints and a selection of encoders based on SMILES, as well as the previously reported state-of-the-art for multimodal drug sensitivity prediction (R² = 0.86 and RMSE = 0.89). Moreover, the explainability of our approach is demonstrated by a thorough analysis of the attention weights. We show that the attended genes significantly enrich apoptotic processes and that the drug attention is strongly correlated with a standard chemical structure similarity index. Finally, we report a case study of two receptor tyrosine kinase (RTK) inhibitors acting on a leukemia cell line, showcasing the ability of the model to focus on informative genes and submolecular regions of the two compounds. The demonstrated generalizability and the interpretability of our model testify to its potential for in silico prediction of anticancer compound efficacy on unseen cancer cells, positioning it as a valid solution for the development of personalized therapies as well as for the evaluation of candidate compounds in de novo drug design.
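For readers unfamiliar with the two regression metrics quoted in this abstract, the sketch below computes the root mean squared error and the coefficient of determination on toy data. The abstract does not state which definition of the R-squared statistic was used (coefficient of determination or squared correlation); the former is assumed here, and the function names and values are hypothetical.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical measured and predicted sensitivities.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
toy_rmse = rmse(measured, predicted)
toy_r2 = r_squared(measured, predicted)
```

RMSE is expressed in the units of the predicted quantity, whereas the coefficient of determination is unitless, which is why the two are usually reported together.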


Subject(s)
Algorithms , Antineoplastic Agents , Deep Learning , Drug Design , Humans , Neural Networks, Computer
14.
PLoS One ; 12(5): e0178304, 2017.
Article in English | MEDLINE | ID: mdl-28562618

ABSTRACT

A subset of neurons in the posterior parietal and premotor areas of the primate brain respond to the locations of visual targets in a hand-centred frame of reference. Such hand-centred visual representations are thought to play an important role in visually-guided reaching to target locations in space. In this paper we show how a biologically plausible, Hebbian learning mechanism may account for the development of localized hand-centred representations in a hierarchical neural network model of the primate visual system, VisNet. The hand-centered neurons developed in the model use an invariance learning mechanism known as continuous transformation (CT) learning. In contrast to previous theoretical proposals for the development of hand-centered visual representations, CT learning does not need a memory trace of recent neuronal activity to be incorporated in the synaptic learning rule. Instead, CT learning relies solely on a Hebbian learning rule, which is able to exploit the spatial overlap that naturally occurs between successive images of a hand-object configuration as it is shifted across different retinal locations due to saccades. Our simulations show how individual neurons in the network model can learn to respond selectively to target objects in particular locations with respect to the hand, irrespective of where the hand-object configuration occurs on the retina. The response properties of these hand-centred neurons further generalise to localised receptive fields in the hand-centred space when tested on novel hand-object configurations that have not been explored during training. Indeed, even when the network is trained with target objects presented across a near continuum of locations around the hand during training, the model continues to develop hand-centred neurons with localised receptive fields in hand-centred space. 
With the help of principal component analysis, we provide the first theoretical framework that explains the behavior of Hebbian learning in VisNet.
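The intuition behind learning from spatial overlap with a purely Hebbian rule, as described in this abstract, can be sketched with a single linear neuron. The code below is not VisNet and omits its hierarchy and competition; the linear activation, the weight normalization, the learning rate, and the toy input patterns are all assumptions made for illustration.

```python
from math import sqrt


def response(weights, x):
    """Linear firing rate of a neuron: weighted sum of its inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def hebbian_update(weights, x, learning_rate=0.1):
    """One Hebbian step (delta w = rate * output * input), then renormalize.

    No memory trace of earlier activity is used; only the current input
    and the current output enter the update.
    """
    y = response(weights, x)
    updated = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
    norm = sqrt(sum(w * w for w in updated))
    return [w / norm for w in updated] if norm > 0 else updated


# Hypothetical input patterns on four input lines.
pattern_seen = [1.0, 1.0, 0.0, 0.0]         # presented during training
pattern_overlapping = [0.0, 1.0, 1.0, 0.0]  # shares one active input
pattern_disjoint = [0.0, 0.0, 1.0, 1.0]     # shares no active input

initial_weights = [0.5, 0.5, 0.5, 0.5]
trained_weights = hebbian_update(initial_weights, pattern_seen)
```

After a single update the neuron responds more strongly to the overlapping pattern than to the disjoint one, which is the kind of overlap-driven association the abstract attributes to continuous transformation learning.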


Subject(s)
Hand , Learning/physiology , Primates/physiology , Visual Pathways/physiology , Animals , Models, Neurological , Nerve Net