Search | VHL Regional Portal

1.

Deep Learning Coordinate-Free Quantum Chemistry.

Matlock, Matthew K; Hoffman, Max; Dang, Na Le; Folmsbee, Dakota L; Langkamp, Luke A; Hutchison, Geoffrey R; Kumar, Neeraj; Sarullo, Kathryn; Swamidass, S Joshua.

J Phys Chem A ; 125(40): 8978-8986, 2021 Oct 14.

Article in English | MEDLINE | ID: mdl-34609871

ABSTRACT

Computing quantum chemical properties of small molecules and polymers can provide valuable insights to physicists, chemists, and biologists when designing new materials, catalysts, biological probes, and drugs. Deep learning can compute quantum chemical properties accurately in a fraction of the time required by commonly used methods such as density functional theory. Most current approaches to deep learning in quantum chemistry begin with geometric information from experimentally derived molecular structures or pre-calculated atom coordinates. These approaches have many useful applications, but they can be costly in time and computational resources. In this study, we demonstrate that accurate quantum chemical computations can be performed without geometric information by operating in the coordinate-free domain using deep learning on graph encodings. Coordinate-free methods rely only on molecular graphs, the connectivity of atoms and bonds, without atom coordinates or bond distances. We also find that the choice of graph-encoding architecture substantially affects the performance of these methods. The structures of these graph-encoding architectures provide an opportunity to probe an important, outstanding question in quantum mechanics: what types of quantum chemical properties can be represented by local variable models? We find that Wave, a local variable model, accurately calculates the quantum chemical properties, while graph convolutional architectures require global variables. Furthermore, local variable Wave models outperform global variable graph convolution models on complex molecules with large, correlated systems.

2.

'Black Box' to 'Conversational' Machine Learning: Ondansetron Reduces Risk of Hospital-Acquired Venous Thromboembolism.

Datta, Arghya; Matlock, Matthew K; Le Dang, Na; Moulin, Thiago; Woeltje, Keith F; Yanik, Elizabeth L; Joshua Swamidass, Sanjay.

IEEE J Biomed Health Inform ; 25(6): 2204-2214, 2021 06.

Article in English | MEDLINE | ID: mdl-33095721

ABSTRACT

Machine learning, combined with a proliferation of electronic healthcare records (EHR), has the potential to transform medicine by identifying previously unknown interventions that reduce the risk of adverse outcomes. To realize this potential, machine learning must leave the conceptual 'black box' in complex domains to overcome several pitfalls, like the presence of confounding variables. These variables predict outcomes but are not causal, often yielding uninformative models. In this work, we envision a 'conversational' approach to design machine learning models, which couple modeling decisions to domain expertise. We demonstrate this approach via a retrospective cohort study to identify factors which affect the risk of hospital-acquired venous thromboembolism (HA-VTE). Using logistic regression for modeling, we have identified drugs that reduce the risk of HA-VTE. Our analysis reveals that ondansetron, an anti-nausea and anti-emetic medication, commonly used in treating side-effects of chemotherapy and post-general anesthesia period, substantially reduces the risk of HA-VTE when compared to aspirin (11% vs. 15% relative risk reduction or RRR, respectively). The low cost and low morbidity of ondansetron may justify further inquiry into its use as a preventative agent for HA-VTE. This case study highlights the importance of engaging domain expertise while applying machine learning in complex domains.

Subject(s)

Venous Thromboembolism , Hospitals , Humans , Machine Learning , Ondansetron/therapeutic use , Retrospective Studies , Risk Factors , Venous Thromboembolism/epidemiology , Venous Thromboembolism/prevention & control

3.

Site-Level Bioactivity of Small-Molecules from Deep-Learned Representations of Quantum Chemistry.

Sarullo, Kathryn; Matlock, Matthew K; Swamidass, S Joshua.

J Phys Chem A ; 124(44): 9194-9202, 2020 Nov 05.

Article in English | MEDLINE | ID: mdl-33084331

ABSTRACT

Atom- or bond-level chemical properties of interest in medicinal chemistry, such as drug metabolism and electrophilic reactivity, are important to understand and predict across arbitrary new molecules. Deep learning can be used to map molecular structures to their chemical properties, but the data sets for these tasks are relatively small, which can limit accuracy and generalizability. To overcome this limitation, it would be preferable to model these properties on the basis of the underlying quantum chemical characteristics of small molecules. However, it is difficult to learn higher level chemical properties from lower level quantum calculations. To overcome this challenge, we pretrained deep learning models to compute quantum chemical properties and then reused the intermediate representations constructed by the pretrained network. Transfer learning, in this way, substantially outperformed models based on chemical graphs alone or quantum chemical properties alone. This result was robust, observable in five prediction tasks: identifying sites of epoxidation by metabolic enzymes and identifying sites of covalent reactivity with cyanide, glutathione, DNA and protein. We see that this approach may substantially improve the accuracy of deep learning models for specific chemical structures, such as aromatic systems.

Subject(s)

Deep Learning , Quantum Theory , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology

4.

Deep learning quantification of percent steatosis in donor liver biopsy frozen sections.

Sun, Lulu; Marsh, Jon N; Matlock, Matthew K; Chen, Ling; Gaut, Joseph P; Brunt, Elizabeth M; Swamidass, S Joshua; Liu, Ta-Chiang.

EBioMedicine ; 60: 103029, 2020 Oct.

Article in English | MEDLINE | ID: mdl-32980688

ABSTRACT

BACKGROUND: Pathologist evaluation of donor liver biopsies provides information for accepting or discarding potential donor livers. Due to the urgent nature of the decision process, this is regularly performed using frozen sectioning at the time of biopsy. The percent steatosis in a donor liver biopsy correlates with transplant outcome, however there is significant inter- and intra-observer variability in quantifying steatosis, compounded by frozen section artifact. We hypothesized that a deep learning model could identify and quantify steatosis in donor liver biopsies. METHODS: We developed a deep learning convolutional neural network that generates a steatosis probability map from an input whole slide image (WSI) of a hematoxylin and eosin-stained frozen section, and subsequently calculates the percent steatosis. Ninety-six WSI of frozen donor liver sections from our transplant pathology service were annotated for steatosis and used to train (n = 30 WSI) and test (n = 66 WSI) the deep learning model. FINDINGS: The model had good correlation and agreement with the annotation in both the training set (r of 0.88, intraclass correlation coefficient [ICC] of 0.88) and novel input test sets (r = 0.85 and ICC=0.85). These measurements were superior to the estimates of the on-service pathologist at the time of initial evaluation (r = 0.52 and ICC=0.52 for the training set, and r = 0.74 and ICC=0.72 for the test set). INTERPRETATION: Use of this deep learning algorithm could be incorporated into routine pathology workflows for fast, accurate, and reproducible donor liver evaluation. FUNDING: Mid-America Transplant Society.
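The step from a steatosis probability map to a single percent-steatosis value, as described in METHODS above, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `percent_steatosis`, the 0.5 probability threshold, and the use of a separate tissue mask are all assumptions made for illustration, since the abstract does not specify how the percentage is derived from the map.

```python
def percent_steatosis(prob_map, tissue_mask, threshold=0.5):
    """Percentage of tissue pixels whose steatosis probability is at or above threshold.

    prob_map: 2D list of per-pixel steatosis probabilities (hypothetical model output).
    tissue_mask: 2D list of booleans marking pixels that contain tissue (assumed input).
    threshold: assumed cutoff for calling a pixel steatotic; not taken from the paper.
    """
    tissue_pixels = 0
    steatotic_pixels = 0
    for prob_row, mask_row in zip(prob_map, tissue_mask):
        for prob, is_tissue in zip(prob_row, mask_row):
            if is_tissue:
                tissue_pixels += 1
                if prob >= threshold:
                    steatotic_pixels += 1
    if tissue_pixels == 0:
        raise ValueError("tissue mask contains no tissue pixels")
    return 100.0 * steatotic_pixels / tissue_pixels


# Toy 2x3 "slide": five tissue pixels, two of them above the threshold.
prob_map = [[0.9, 0.2, 0.7], [0.1, 0.6, 0.4]]
tissue_mask = [[True, True, True], [True, False, True]]
print(percent_steatosis(prob_map, tissue_mask))  # 40.0
```

Restricting the denominator to tissue pixels keeps background glass from diluting the estimate; a real pipeline would also need to handle tiling of the whole slide image.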

Subject(s)

Deep Learning , Fatty Liver/pathology , Living Donors , Algorithms , Biopsy , Fatty Liver/diagnostic imaging , Frozen Sections , Humans , Image Processing, Computer-Assisted/methods , Immunohistochemistry , Liver Transplantation , Molecular Sequence Annotation , Neural Networks, Computer , Severity of Illness Index

5.

The Metabolic Rainbow: Deep Learning Phase I Metabolism in Five Colors.

Dang, Na Le; Matlock, Matthew K; Hughes, Tyler B; Swamidass, S Joshua.

J Chem Inf Model ; 60(3): 1146-1164, 2020 03 23.

Article in English | MEDLINE | ID: mdl-32040319

ABSTRACT

Metabolism of drugs affects their absorption, distribution, efficacy, excretion, and toxicity profiles. Metabolism is routinely assessed experimentally using recombinant enzymes, human liver microsomes, and animal models. Unfortunately, these experiments are expensive, time-consuming, and often extrapolate poorly to humans because they fail to capture the full breadth of metabolic reactions observed in vivo. As a result, metabolic pathways leading to the formation of toxic metabolites are often missed during drug development, giving rise to costly failures. To address some of these limitations, computational metabolism models can rapidly and cost-effectively predict sites of metabolism (the atoms or bonds which undergo enzymatic modifications) on thousands of drug candidates, thereby improving the likelihood of discovering metabolic transformations forming toxic metabolites. However, current computational metabolism models are often unable to predict the specific metabolites formed by metabolism at certain sites. Identification of reaction type is a key step toward metabolite prediction. Phase I enzymes, which are responsible for the metabolism of more than 90% of FDA approved drugs, catalyze highly diverse types of reactions and produce metabolites with substantial structural variability. Without knowledge of potential metabolite structures, medicinal chemists cannot differentiate harmful metabolic transformations from beneficial ones. To address this shortcoming, we propose a system for simultaneously labeling sites of metabolism and reaction types, by classifying them into five key reaction classes: stable and unstable oxidations, dehydrogenation, hydrolysis, and reduction. These classes unambiguously identify 21 types of phase I reactions, which cover 92.3% of known reactions in our database. We used this labeling system to train a neural network model of phase I metabolism on a literature-derived data set encompassing 20,736 human phase I metabolic reactions.
Our model, Rainbow XenoSite, was able to identify reaction-type specific sites of metabolism with a cross-validated accuracy of 97.1% area under the receiver operator curve. Rainbow XenoSite with five-color and combined output is available for use free and online through our secure server at http://swami.wustl.edu/xenosite/p/phase1_rainbow.

Subject(s)

Deep Learning , Animals , Color , Humans , Metabolic Networks and Pathways , Microsomes, Liver , Neural Networks, Computer

6.

A Time-Embedding Network Models the Ontogeny of 23 Hepatic Drug Metabolizing Enzymes.

Matlock, Matthew K; Tambe, Abhik; Elliott-Higgins, Jack; Hines, Ronald N; Miller, Grover P; Swamidass, S Joshua.

Chem Res Toxicol ; 32(8): 1707-1721, 2019 08 19.

Article in English | MEDLINE | ID: mdl-31304741

ABSTRACT

Pediatric patients are at elevated risk of adverse drug reactions, and there is insufficient information on drug safety in children. Complicating risk assessment in children, there are numerous age-dependent changes in the absorption, distribution, metabolism, and elimination of drugs. A key contributor to age-dependent drug toxicity risk is the ontogeny of drug metabolism enzymes, the changes in both abundance and type throughout development from the fetal period through adulthood. Critically, these changes affect not only the overall clearance of drugs but also exposure to individual metabolites. In this study, we introduce time-embedding neural networks in order to model population-level variation in metabolism enzyme expression as a function of age. We use a time-embedding network to model the ontogeny of 23 drug metabolism enzymes. The time-embedding network recapitulates known demographic factors impacting 3A5 expression. The time-embedding network also effectively models the nonlinear dynamics of 2D6 expression, enabling a better fit to clinical data than prior work. In contrast, a standard neural network fails to model these features of 3A5 and 2D6 expression. Finally, we combine the time-embedding model of ontogeny with additional information to estimate age-dependent changes in reactive metabolite exposure. This simple approach identifies age-dependent changes in exposure to valproic acid and dextromethorphan metabolites and suggests potential mechanisms of valproic acid toxicity. This approach may help researchers evaluate the risk of drug toxicity in pediatric populations.

Subject(s)

Liver Neoplasms/metabolism , Neural Networks, Computer , Adolescent , Carboxylesterase/metabolism , Child , Child, Preschool , Cytochrome P-450 Enzyme System/metabolism , Glucuronosyltransferase/metabolism , Glutathione Transferase/metabolism , Humans , Inactivation, Metabolic , Infant , Oxygenases/metabolism , Principal Component Analysis , Sulfurtransferases/metabolism , Time Factors

7.

Modeling Small-Molecule Reactivity Identifies Promiscuous Bioactive Compounds.

Matlock, Matthew K; Hughes, Tyler B; Dahlin, Jayme L; Swamidass, S Joshua.

J Chem Inf Model ; 58(8): 1483-1500, 2018 08 27.

Article in English | MEDLINE | ID: mdl-29990427

ABSTRACT

Scientists rely on high-throughput screening tools to identify promising small-molecule compounds for the development of biochemical probes and drugs. This study focuses on the identification of promiscuous bioactive compounds, which are compounds that appear active in many high-throughput screening experiments against diverse targets but are often false-positives which may not be easily developed into successful probes. These compounds can exhibit bioactivity due to nonspecific, intractable mechanisms of action and/or by interference with specific assay technology readouts. Such "frequent hitters" are now commonly identified using substructure filters, including pan assay interference compounds (PAINS). Herein, we show that mechanistic modeling of small-molecule reactivity using deep learning can improve upon PAINS filters when modeling promiscuous bioactivity in PubChem assays. Without training on high-throughput screening data, a deep learning model of small-molecule reactivity achieves a sensitivity and specificity of 18.5% and 95.5%, respectively, in identifying promiscuous bioactive compounds. This performance is similar to PAINS filters, which achieve a sensitivity of 20.3% at the same specificity. Importantly, such reactivity modeling is complementary to PAINS filters. When PAINS filters and reactivity models are combined, the resulting model outperforms either method alone, achieving a sensitivity of 24% at the same specificity. However, as a probabilistic model, the sensitivity and specificity of the deep learning model can be tuned by adjusting the threshold. Moreover, for a subset of PAINS filters, this reactivity model can help discriminate between promiscuous and nonpromiscuous bioactive compounds even among compounds matching those filters. Critically, the reactivity model provides mechanistic hypotheses for assay interference by predicting the precise atoms involved in compound reactivity. 
Overall, our analysis suggests that deep learning approaches to modeling promiscuous compound bioactivity may provide a complementary approach to current methods for identifying promiscuous compounds.
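The abstract notes that, because the reactivity model is probabilistic, its sensitivity and specificity can be tuned by adjusting the decision threshold. A minimal sketch of that trade-off follows. The function name, scores, and labels are invented for illustration and are not data or code from the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive.

    scores: predicted probabilities of promiscuity (hypothetical values).
    labels: True for compounds known to be promiscuous, False otherwise.
    """
    tp = fn = tn = fp = 0
    for score, is_positive in zip(scores, labels):
        predicted_positive = score >= threshold
        if is_positive and predicted_positive:
            tp += 1
        elif is_positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: four promiscuous compounds followed by four non-promiscuous ones.
scores = [0.95, 0.80, 0.40, 0.30, 0.70, 0.20, 0.10, 0.05]
labels = [True, True, True, True, False, False, False, False]

print(sensitivity_specificity(scores, labels, 0.75))  # (0.5, 1.0)
print(sensitivity_specificity(scores, labels, 0.25))  # (1.0, 0.75)
```

Raising the threshold trades sensitivity for specificity, which is why the comparison with PAINS filters in the abstract is made at a fixed specificity.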

Subject(s)

Drug Discovery/methods , High-Throughput Screening Assays/methods , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Animals , Computer Simulation , Databases, Factual , Enzyme Inhibitors/chemistry , Enzyme Inhibitors/pharmacology , Histone Acetyltransferases/antagonists & inhibitors , Histone Acetyltransferases/metabolism , Humans , Models, Biological , Neural Networks, Computer

8.

Deep Learning Global Glomerulosclerosis in Transplant Kidney Frozen Sections.

Marsh, Jon N; Matlock, Matthew K; Kudose, Satoru; Liu, Ta-Chiang; Stappenbeck, Thaddeus S; Gaut, Joseph P; Swamidass, S Joshua.

IEEE Trans Med Imaging ; 37(12): 2718-2728, 2018 12.

Article in English | MEDLINE | ID: mdl-29994669

ABSTRACT

Transplantable kidneys are in very limited supply. Accurate viability assessment prior to transplantation could minimize organ discard. Rapid and accurate evaluation of intra-operative donor kidney biopsies is essential for determining which kidneys are eligible for transplantation. The criterion for accepting or rejecting donor kidneys relies heavily on pathologist determination of the percent of glomeruli (determined from a frozen section) that are normal and sclerotic. This percentage is a critical measurement that correlates with transplant outcome. Inter- and intra-observer variability in donor biopsy evaluation is, however, significant. An automated method for determination of percent global glomerulosclerosis could prove useful in decreasing evaluation variability, increasing throughput, and easing the burden on pathologists. Here, we describe the development of a deep learning model that identifies and classifies non-sclerosed and sclerosed glomeruli in whole-slide images of donor kidney frozen section biopsies. This model extends a convolutional neural network (CNN) pre-trained on a large database of digital images. The extended model, when trained on just 48 whole slide images, exhibits slide-level evaluation performance on par with expert renal pathologists. Encouragingly, the model's performance is robust to slide preparation artifacts associated with frozen section preparation. The model substantially outperforms a model trained on image patches of isolated glomeruli, in terms of both accuracy and speed. The methodology overcomes the technical challenge of applying a pretrained CNN bottleneck model to whole-slide image classification. The traditional patch-based approach, while exhibiting deceptively good performance classifying isolated patches, does not translate successfully to whole-slide image segmentation in this setting. 
As the first model reported that identifies and classifies normal and sclerotic glomeruli in frozen kidney sections, and thus the first model reported in the literature relevant to kidney transplantation, it may become an essential part of donor kidney biopsy evaluation in the clinical setting.

Subject(s)

Deep Learning , Glomerulonephritis/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Kidney/diagnostic imaging , Transplants/diagnostic imaging , Algorithms , Frozen Sections , Humans , Kidney Transplantation

9.

Learning a Local-Variable Model of Aromatic and Conjugated Systems.

Matlock, Matthew K; Dang, Na Le; Swamidass, S Joshua.

ACS Cent Sci ; 4(1): 52-62, 2018 Jan 24.

Article in English | MEDLINE | ID: mdl-29392176

ABSTRACT

A collection of new approaches to building and training neural networks, collectively referred to as deep learning, are attracting attention in theoretical chemistry. Several groups aim to replace computationally expensive ab initio quantum mechanics calculations with learned estimators. This raises questions about the representability of complex quantum chemical systems with neural networks. Can local-variable models efficiently approximate nonlocal quantum chemical features? Here, we find that convolutional architectures, those that only aggregate information locally, cannot efficiently represent aromaticity and conjugation in large systems. They cannot represent long-range nonlocality known to be important in quantum chemistry. This study uses aromatic and conjugated systems computed from molecule graphs, though reproducing quantum simulations is the ultimate goal. This task, by definition, is both computable and known to be important to chemistry. The failure of convolutional architectures on this focused task calls into question their use in modeling quantum mechanics. To remedy this heretofore unrecognized deficiency, we introduce a new architecture that propagates information back and forth in waves of nonlinear computation. This architecture is still a local-variable model, and it is both computationally and representationally efficient, processing molecules in sublinear time with far fewer parameters than convolutional networks. Wave-like propagation models aromatic and conjugated systems with high accuracy, and even models the impact of small structural changes on large molecules. This new architecture demonstrates that some nonlocal features of quantum chemistry can be efficiently represented in local variable models.

10.

XenoSite server: a web-available site of metabolism prediction tool.

Matlock, Matthew K; Hughes, Tyler B; Swamidass, S Joshua.

Bioinformatics ; 31(7): 1136-7, 2015 Apr 01.

Article in English | MEDLINE | ID: mdl-25411327

ABSTRACT

Cytochrome P450 enzymes (P450s) are metabolic enzymes that process the majority of FDA-approved, small-molecule drugs. Understanding how these enzymes modify molecule structure is key to the development of safe, effective drugs. XenoSite server is an online implementation of XenoSite, a recently published computational model for P450 metabolism. XenoSite predicts which atomic sites of a molecule, the sites of metabolism (SOMs), are modified by P450s. XenoSite server accepts input in common chemical file formats including SDF and SMILES and provides tools for visualizing the likelihood that each atomic site is a site of metabolism for a variety of important P450s, as well as a flat file download of SOM predictions. AVAILABILITY AND IMPLEMENTATION: XenoSite server is available at http://swami.wustl.edu/xenosite.

Subject(s)

Computational Biology/methods , Cytochrome P-450 Enzyme System/metabolism , Dibenzothiepins/metabolism , Internet , Metabolic Networks and Pathways , Xenobiotics/metabolism , Antipsychotic Agents/metabolism , Cytochrome P-450 Enzyme System/chemistry , Humans , Molecular Docking Simulation , Neural Networks, Computer , Probability , Small Molecule Libraries/metabolism

11.

ProteomeScout: a repository and analysis resource for post-translational modifications and proteins.

Matlock, Matthew K; Holehouse, Alex S; Naegle, Kristen M.

Nucleic Acids Res ; 43(Database issue): D521-30, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25414335

ABSTRACT

ProteomeScout (https://proteomescout.wustl.edu) is a resource for the study of proteins and their post-translational modifications (PTMs) consisting of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations. The PTM database is a compendium of public PTM data, coupled with user-uploaded experimental data. ProteomeScout provides analysis tools for experimental datasets, including summary views and subset selection, which can identify relationships within subsets of data by testing for statistically significant enrichment of protein annotations. Protein annotations are incorporated in the ProteomeScout database from external resources and include terms such as Gene Ontology annotations, domains, secondary structure and non-synonymous polymorphisms. These annotations are available in the database download, in the analysis tools and in the protein viewer. The protein viewer allows for the simultaneous visualization of annotations in an interactive web graphic, which can be exported in Scalable Vector Graphics (SVG) format. Finally, quantitative data measurements associated with public experiments are also easily viewable within protein records, allowing researchers to see how PTMs change across different contexts. ProteomeScout should prove useful for protein researchers and should benefit the proteomics community by providing a stable repository for PTM experiments.

Subject(s)

Databases, Protein , Protein Processing, Post-Translational , Internet , Molecular Sequence Annotation , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Proteomics
