biorxiv; 2021.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2021.12.10.471928


The COVID-19 pandemic highlights the need for computational tools to automate and accelerate drug design for novel protein targets. We leverage deep learning language models to generate and score drug candidates based on predicted protein binding affinity. We pre-trained a deep learning language model (BERT) on ~9.6 billion molecules and achieved peak performance of 603 petaflops in mixed precision. Our work reduces pre-training time from days to hours, compared to previous efforts with this architecture, while also increasing the dataset size by nearly an order of magnitude. For scoring, we fine-tuned the language model using an assembled set of thousands of protein targets with binding affinity data and searched for inhibitors of specific protein targets, SARS-CoV-2 Mpro and PLpro. We utilized a genetic algorithm approach for finding optimal candidates using the generation and scoring capabilities of the language model. Our generalizable models accelerate the identification of inhibitors for emerging therapeutic targets.
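The generate-and-score loop described in this abstract can be illustrated with a toy genetic algorithm. This is only a sketch under loud assumptions: `toy_score` stands in for the fine-tuned affinity predictor, `mutate` stands in for language-model generation, and `ALPHABET` is a made-up token set rather than a real molecular vocabulary. None of these names come from the preprint.

```python
import random

ALPHABET = "CNOF"  # hypothetical toy token set, not a real SMILES vocabulary


def toy_score(candidate: str) -> float:
    """Placeholder for a learned affinity predictor: counts 'N' and 'O' tokens."""
    return float(sum(ch in "NO" for ch in candidate))


def mutate(candidate: str, rng: random.Random) -> str:
    """Placeholder for model-driven generation: resample one random position."""
    pos = rng.randrange(len(candidate))
    return candidate[:pos] + rng.choice(ALPHABET) + candidate[pos + 1:]


def evolve(population, generations, rng, score=toy_score):
    """Elitist loop: propose one mutant per member, then keep the top scorers.

    The population size stays constant, and because parents compete with
    their offspring, the best score never decreases between generations.
    """
    size = len(population)
    for _ in range(generations):
        offspring = [mutate(c, rng) for c in population]
        pool = population + offspring
        population = sorted(pool, key=score, reverse=True)[:size]
    return population


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    start = ["CCCCCC", "CCNCCC", "CFCFCF", "CCCCOC"]
    for candidate in evolve(start, generations=20, rng=rng):
        print(candidate, toy_score(candidate))
```

In a realistic setting the two placeholders would be replaced by calls to a generative model and a trained scoring model; the selection step itself would stay the same shape.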

biorxiv; 2021.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2021.10.09.463779


The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) replication transcription complex (RTC) is a multi-domain protein responsible for replicating and transcribing the viral mRNA inside a human cell. Attacking RTC function with pharmaceutical compounds is a pathway to treating COVID-19. Conventional tools, e.g., cryo-electron microscopy and all-atom molecular dynamics (AAMD), do not provide sufficiently high resolution or timescale to capture important dynamics of this molecular machine. Consequently, we develop an innovative workflow that bridges the gap between these resolutions, using mesoscale fluctuating finite element analysis (FFEA) continuum simulations and a hierarchy of AI methods that continually learn and infer features for maintaining consistency between AAMD and FFEA simulations. We leverage a multi-site distributed workflow manager to orchestrate AI, FFEA, and AAMD jobs, providing optimal resource utilization across HPC centers. Our study provides unprecedented access to the SARS-CoV-2 RTC machinery, while providing a general capability for AI-enabled multi-resolution simulations at scale.

biorxiv; 2021.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2021.03.31.437918


β-coronaviruses alone have been responsible for three major global outbreaks in the 21st century. The current crisis has led to an urgent requirement to develop therapeutics. Even though a number of vaccines are available, alternative strategies targeting essential viral components are required as a back-up against the emergence of lethal viral variants. One such target is the main protease (Mpro), which plays an indispensable role in viral replication. The availability of over 270 Mpro X-ray structures in complex with inhibitors provides unique insights into ligand-protein interactions. Herein, we provide a comprehensive comparison of all non-redundant ligand-binding sites available for SARS-CoV-2, SARS-CoV and MERS-CoV Mpro. Extensive adaptive sampling, employing convolutional variational autoencoder-based deep learning, has been used to explore conformational dynamics, and Markov state models have been used to investigate structural conservation of the ligand-binding sites across β-coronavirus homologs. Our results indicate that not all ligand-binding sites are dynamically conserved despite high sequence and structural conservation across β-coronavirus homologs. This highlights the complexity in targeting all three Mpro enzymes with a single pan inhibitor.

arxiv; 2020.
Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2010.06574v1


The drug discovery process currently employed in the pharmaceutical industry typically requires about 10 years and $2-3 billion to deliver one new drug. This is both too expensive and too slow, especially in emergencies like the COVID-19 pandemic. In silico methodologies need to be improved to better select lead compounds that can proceed to later stages of the drug discovery protocol, accelerating the entire process. No single methodological approach can achieve the necessary accuracy with the required efficiency. Here we describe multiple algorithmic innovations to overcome this fundamental limitation, together with the development and deployment of computational infrastructure at scale that integrates multiple artificial intelligence and simulation-based approaches. Three measures of performance are: (i) throughput, the number of ligands per unit time; (ii) scientific performance, the number of effective ligands sampled per unit time; and (iii) peak performance, in flop/s. The capabilities outlined here have been used in production for several months as the workhorse of the computational infrastructure to support the capabilities of the US-DOE National Virtual Biotechnology Laboratory in combination with resources from the EU Centre of Excellence in Computational Biomedicine.
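The three performance measures named in this abstract are simple rates, and a small sketch may make the distinction between them concrete. The function names and the idea of measuring peak performance as the best rate over a set of timed windows are assumptions made for illustration; the abstract does not say how "effective" ligands are identified or how peak flop/s was measured.

```python
def throughput(ligands_processed: int, seconds: float) -> float:
    """Measure (i): ligands handled per unit time."""
    return ligands_processed / seconds


def scientific_performance(effective_ligands: int, seconds: float) -> float:
    """Measure (ii): effective ligands sampled per unit time.

    What counts as 'effective' is decided upstream and is not modeled here.
    """
    return effective_ligands / seconds


def peak_flops(windows: list[tuple[float, float]]) -> float:
    """Measure (iii): highest flop/s seen over timed windows.

    Each window is (floating_point_operations, seconds); this windowed
    definition is an assumption of the sketch.
    """
    return max(ops / secs for ops, secs in windows)
```

With purely illustrative inputs, `throughput(1000, 10.0)` gives 100 ligands per second while `scientific_performance(50, 10.0)` gives 5 effective ligands per second, which shows how a pipeline can have high throughput and still low scientific performance.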

chemrxiv; 2020.
Preprint in English | PREPRINT-CHEMRXIV | ID: ppzbmed-10.26434.chemrxiv.12725465.v1


We present a supercomputer-driven pipeline for in silico drug discovery using enhanced sampling molecular dynamics (MD) and ensemble docking. We also describe preliminary results obtained for 23 systems involving eight protein targets of the proteome of SARS-CoV-2. The MD performed is temperature replica-exchange enhanced sampling, making use of massively parallel supercomputing on the SUMMIT supercomputer at Oak Ridge National Laboratory, with which more than 1 ms of enhanced sampling MD can be generated per day. We have ensemble docked repurposing databases to ten configurations of each of the 23 SARS-CoV-2 systems using AutoDock Vina. We also demonstrate that using AutoDock-GPU on SUMMIT, it is possible to perform exhaustive docking of one billion compounds in under 24 hours. Finally, we discuss preliminary results and planned improvements to the pipeline, including the use of quantum mechanical (QM), machine learning, and AI methods to cluster MD trajectories and rescore docking poses.
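As a back-of-the-envelope check on the docking figure quoted in this abstract, one billion compounds in under 24 hours implies a lower bound on the aggregate docking rate. The helper name `min_rate` is hypothetical, and the calculation assumes the work is spread evenly over the full 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def min_rate(n_items: float, seconds: float) -> float:
    """Average rate needed to finish n_items within the given time."""
    return n_items / seconds


# One billion compounds in (at most) one day, as stated in the abstract.
docking_rate = min_rate(1e9, SECONDS_PER_DAY)
print(f"{docking_rate:,.0f} compounds per second")  # prints "11,574 compounds per second"
```

Because the abstract says "under 24 hours", the true aggregate rate across the machine would be at least this value.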
