Results 1 - 20 of 394
1.
J Biomed Semantics ; 15(1): 5, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693563

ABSTRACT

Leveraging AI for synthesizing the deluge of biomedical knowledge has great potential for pharmacological discovery with applications including developing new therapeutics for untreated diseases and repurposing drugs as emergent pandemic treatments. Creating knowledge graph representations of interacting drugs, diseases, genes, and proteins enables discovery via embedding-based ML approaches and link prediction. Previously, it has been shown that these predictive methods are susceptible to biases from network structure, namely that they are driven not by discovering nuanced biological understanding of mechanisms, but based on high-degree hub nodes. In this work, we study the confounding effect of network topology on biological relation semantics by creating an experimental pipeline of knowledge graph semantic and topological perturbations. We show that the drop in drug repurposing performance from ablating meaningful semantics increases by 21% and 38% when mitigating topological bias in two networks. We demonstrate that new methods for representing knowledge and inferring new knowledge must be developed for making use of biomedical semantics for pharmacological innovation, and we suggest fruitful avenues for their development.


Subject(s)
Drug Discovery , Semantics , Drug Discovery/methods , Drug Repositioning/methods
2.
Comput Struct Biotechnol J ; 23: 1320-1338, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38585646

ABSTRACT

Many research groups and institutions have created a variety of databases curating experimental and predicted data related to protein-ligand binding. The landscape of available databases is dynamic, with new databases emerging and established databases becoming defunct. Here, we review the current state of databases that contain binding pockets and protein-ligand binding interactions. We have compiled a list of such databases, fifty-three of which are currently available for use. We discuss variation in how binding pockets are defined and summarize pocket-finding methods. We organize the fifty-three databases into subgroups based on goals and contents, and describe standard use cases. We also illustrate that pockets within the same protein are characterized differently across different databases. Finally, we assess critical issues of sustainability, accessibility and redundancy.

3.
medRxiv ; 2024 Apr 07.
Article in English | MEDLINE | ID: mdl-38633781

ABSTRACT

Electronic health records (EHRs) coupled with large-scale biobanks offer great promises to unravel the genetic underpinnings of treatment efficacy. However, medication-induced biomarker trajectories stemming from such records remain poorly studied. Here, we extract clinical and medication prescription data from EHRs and conduct GWAS and rare variant burden tests in the UK Biobank (discovery) and the All of Us program (replication) on ten cardiometabolic drug response outcomes including lipid response to statins, HbA1c response to metformin and blood pressure response to antihypertensives (N = 740-26,669). Our findings at genome-wide significance level recover previously reported pharmacogenetic signals and also include novel associations for lipid response to statins (N = 26,669) near LDLR and ZNF800. Importantly, these associations are treatment-specific and not associated with biomarker progression in medication-naive individuals. Furthermore, we demonstrate that individuals with higher genetically determined low-density and total cholesterol baseline levels experience increased absolute, albeit lower relative biomarker reduction following statin treatment. In summary, we systematically investigated the common and rare pharmacogenetic contribution to cardiometabolic drug response phenotypes in over 50,000 UK Biobank and All of Us participants with EHR and identified clinically relevant genetic predictors for improved personalized treatment strategies.

4.
Article in English | MEDLINE | ID: mdl-38598857

ABSTRACT

Drug repurposing refers to the inference of therapeutic relationships between a clinical indication and existing compounds. As an emerging paradigm in drug development, drug repurposing enables more efficient treatment of rare diseases, stratified patient populations, and urgent threats to public health. However, prioritizing well-suited drug candidates from among a nearly infinite number of repurposing options continues to represent a significant challenge in drug development. Over the past decade, advances in genomic profiling, database curation, and machine learning techniques have enabled more accurate identification of drug repurposing candidates for subsequent clinical evaluation. This review outlines the major methodologic classes that these approaches comprise, which rely on (a) protein structure, (b) genomic signatures, (c) biological networks, and (d) real-world clinical data. We propose that realizing the full impact of drug repurposing methodologies requires a multidisciplinary understanding of each method's advantages and limitations with respect to clinical practice.

5.
Cell Rep ; 42(12): 113544, 2023 12 26.
Article in English | MEDLINE | ID: mdl-38060381

ABSTRACT

Dysregulated iron or Ca2+ homeostasis has been reported in Parkinson's disease (PD) models. Here, we discover a connection between these two metals at the mitochondria. Elevation of iron levels causes inward mitochondrial Ca2+ overflow, through an interaction of Fe2+ with mitochondrial calcium uniporter (MCU). In PD neurons, iron accumulation-triggered Ca2+ influx across the mitochondrial surface leads to spatially confined Ca2+ elevation at the outer mitochondrial membrane, which is subsequently sensed by Miro1, a Ca2+-binding protein. A Miro1 blood test distinguishes PD patients from controls and responds to drug treatment. Miro1-based drug screens in PD cells discover Food and Drug Administration-approved T-type Ca2+-channel blockers. Human genetic analysis reveals enrichment of rare variants in T-type Ca2+-channel subtypes associated with PD status. Our results identify a molecular mechanism in PD pathophysiology and drug targets and candidates coupled with a convenient stratification method.


Subject(s)
Calcium , Parkinson Disease , Humans , Calcium/metabolism , Parkinson Disease/drug therapy , Parkinson Disease/genetics , Parkinson Disease/metabolism , Pharmaceutical Preparations/metabolism , Iron/metabolism , Mitochondria/metabolism
6.
bioRxiv ; 2023 Oct 16.
Article in English | MEDLINE | ID: mdl-37905033

ABSTRACT

The rapid expansion of protein sequence and structure databases has resulted in a significant number of proteins with ambiguous or unknown function. While advances in machine learning techniques hold great potential to fill this annotation gap, current methods for function prediction are unable to associate global function reliably to the specific residues responsible for that function. We address this issue by introducing PARSE (Protein Annotation by Residue-Specific Enrichment), a knowledge-based method which combines pre-trained embeddings of local structural environments with traditional statistical techniques to identify enriched functions with residue-level explainability. For the task of predicting the catalytic function of enzymes, PARSE achieves comparable or superior global performance to state-of-the-art machine learning methods (F1 score > 85%) while simultaneously annotating the specific residues involved in each function with much greater precision. Since it does not require supervised training, our method can make one-shot predictions for very rare functions and is not limited to a particular type of functional label (e.g. Enzyme Commission numbers or Gene Ontology codes). Finally, we leverage the AlphaFold Structure Database to perform functional annotation at a proteome scale. By applying PARSE to the dark proteome (predicted structures which cannot be classified into known structural families), we predict several novel bacterial metalloproteases. Each of these proteins shares a strongly conserved catalytic site despite highly divergent sequences and global folds, illustrating the value of local structure representations for new function discovery.

7.
Nat Genet ; 55(11): 1876-1891, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37857935

ABSTRACT

Noncoding variants of presumed regulatory function contribute to the heritability of neuropsychiatric disease. A total of 2,221 noncoding variants connected to risk for ten neuropsychiatric disorders, including autism spectrum disorder, attention deficit hyperactivity disorder, bipolar disorder, borderline personality disorder, major depression, generalized anxiety disorder, panic disorder, post-traumatic stress disorder, obsessive-compulsive disorder and schizophrenia, were studied in developing human neural cells. Integrating epigenomic and transcriptomic data with massively parallel reporter assays identified differentially-active single-nucleotide variants (daSNVs) in specific neural cell types. Expression-gene mapping, network analyses and chromatin looping nominated candidate disease-relevant target genes modulated by these daSNVs. Follow-up integration of daSNV gene editing with clinical cohort analyses suggested that magnesium transport dysfunction may increase neuropsychiatric disease risk and indicated that common genetic pathomechanisms may mediate specific symptoms that are shared across multiple neuropsychiatric diseases.


Subject(s)
Attention Deficit Disorder with Hyperactivity , Autism Spectrum Disorder , Bipolar Disorder , Depressive Disorder, Major , Obsessive-Compulsive Disorder , Schizophrenia , Humans , Autism Spectrum Disorder/genetics , Bipolar Disorder/genetics , Schizophrenia/genetics , Obsessive-Compulsive Disorder/genetics , Obsessive-Compulsive Disorder/psychology , Depressive Disorder, Major/genetics , Attention Deficit Disorder with Hyperactivity/genetics
8.
Sci Transl Med ; 15(713): eadi0336, 2023 09 13.
Article in English | MEDLINE | ID: mdl-37703349

ABSTRACT

Regulatory agencies need to ensure the safety and equity of AI in biomedicine, and the time to do so is now.

9.
N Engl J Med ; 389(15): 1431-1434, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37732608
10.
Am J Hum Genet ; 110(9): 1522-1533, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37607538

ABSTRACT

Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results.


Subject(s)
Biological Specimen Banks , Data Science , Humans , Phenomics , Phenotype , Genotype
11.
J Biomed Inform ; 145: 104474, 2023 09.
Article in English | MEDLINE | ID: mdl-37572825

ABSTRACT

Inferring knowledge from known relationships between drugs, proteins, genes, and diseases has great potential for clinical impact, such as predicting which existing drugs could be repurposed to treat rare diseases. Incorporating key biological context such as cell type or tissue of action into representations of extracted biomedical knowledge is essential for principled pharmacological discovery. Existing global, literature-derived knowledge graphs of interactions between drugs, proteins, genes, and diseases lack this essential information. In this study, we frame the task of associating biological context with protein-protein interactions extracted from text as a classification task using syntactic, semantic, and novel meta-discourse features. We introduce the Insider corpora, which are automatically generated PubMed-scale corpora for training classifiers for the context association task. These corpora are created by searching for precise syntactic cues of cell type and tissue relevancy to extracted regulatory relations. We report F1 scores of 0.955 and 0.862 for identifying relevant cell types and tissues, respectively, for our identified relations. By classifying with this framework, we demonstrate that the problem of context association can be addressed using intuitive, interpretable features. We demonstrate the potential of this approach to enrich text-derived knowledge bases with biological detail by incorporating cell type context into a protein-protein network for dengue fever.


Subject(s)
Data Mining , Knowledge Bases , Humans , PubMed , Rare Diseases
12.
Camb Prism Precis Med ; 1: e18, 2023.
Article in English | MEDLINE | ID: mdl-37560024

ABSTRACT

Pharmacogenetics, the study of how interindividual genetic differences affect drug response, does not explain all observed heritable variance in drug response. Epigenetic mechanisms, such as DNA methylation and histone acetylation, may account for some of the unexplained variance. Epigenetic mechanisms modulate gene expression and can be suitable drug targets and can impact the action of nonepigenetic drugs. Pharmacoepigenetics is the field that studies the relationship between epigenetic variability and drug response. Much of this research focuses on compounds targeting epigenetic mechanisms, called epigenetic drugs, which are used to treat cancers, immune disorders, and other diseases. Several studies also suggest an epigenetic role in classical drug response; however, we know little about this area. The amount of information correlating epigenetic biomarkers to molecular datasets has recently expanded due to technological advances, and novel computational approaches have emerged to better identify and predict epigenetic interactions. We propose that the relationship between epigenetics and classical drug response may be examined using data already available by (1) finding regions of epigenetic variance, (2) pinpointing key epigenetic biomarkers within these regions, and (3) mapping these biomarkers to a drug-response phenotype. This approach expands on existing knowledge to generate putative pharmacoepigenetic relationships, which can be tested experimentally. Epigenetic modifications are involved in disease and drug response. Therefore, understanding how epigenetic drivers impact the response to classical drugs is important for improving drug design and administration to better treat disease.

13.
Transl Vis Sci Technol ; 12(8): 8, 2023 08 01.
Article in English | MEDLINE | ID: mdl-37561511

ABSTRACT

Purpose: The genetic architecture of corneal dysfunction remains poorly understood. Epidemiological and clinical evidence suggests a relationship between corneal structural features and anthropometric measures. We used global and local genetic similarity analysis to identify genomic features that may underlie structural corneal dysfunction. Methods: We assembled genome-wide association study summary statistics for corneal features (central corneal thickness, corneal hysteresis [CH], corneal resistance factor [CRF], and the 3 mm index of keratometry) and anthropometric traits (body mass index, weight, and height) in Europeans. We calculated global genetic correlations (rg) between traits using linkage disequilibrium (LD) score regression and local genetic covariance using ρ-HESS, which partitions the genome and performs regression with LD regions. Finally, we identified genes located within regions of significant genetic covariance and analyzed patterns of tissue expression and pathway enrichment. Results: Global LD score regression revealed significant negative correlations between height and both CH (rg = -0.12; P = 2.0 × 10⁻⁷) and CRF (rg = -0.11; P = 6.9 × 10⁻⁷). Local analysis revealed 68 genomic regions exhibiting significant local genetic covariance between CRF and height, containing 2874 unique genes. Pathway analysis of genes in regions with significant local rg revealed enrichment among signaling pathways with known keratoconus associations, including cadherin and Wnt signaling, as well as enrichment of genes modulated by copper and zinc ions. Conclusions: Corneal biophysical parameters and height share a common genomic architecture, which may facilitate identification of disease-associated genes and therapies for corneal ectasias. Translational Relevance: Local genetic covariance analysis enables the identification of associated genes and therapeutic targets for corneal ectatic disease.


Subject(s)
Genome-Wide Association Study , Keratoconus , Humans , Cornea , Keratoconus/metabolism , Physical Examination
14.
Cell Rep Methods ; 3(7): 100503, 2023 07 24.
Article in English | MEDLINE | ID: mdl-37529368

ABSTRACT

We demonstrate that integrative analysis of CRISPR screening datasets enables network-based prioritization of prescription drugs modulating viral entry in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by developing a network-based approach called Rapid proXimity Guidance for Repurposing Investigational Drugs (RxGRID). We use our results to guide a propensity-score-matched, retrospective cohort study of 64,349 COVID-19 patients, showing that a top candidate drug, spironolactone, is associated with improved clinical prognosis, measured by intensive care unit (ICU) admission and mechanical ventilation rates. Finally, we show that spironolactone exerts a dose-dependent inhibitory effect on viral entry in human lung epithelial cells. Our RxGRID method presents a computational framework, implemented as an open-source software package, enabling genomics researchers to identify drugs likely to modulate a molecular phenotype of interest based on high-throughput screening data. Our results, derived from this method and supported by experimental and clinical analysis, add additional supporting evidence for a potential protective role of the potassium-sparing diuretic spironolactone in severe COVID-19.


Subject(s)
COVID-19 , Humans , SARS-CoV-2/genetics , Spironolactone/pharmacology , Retrospective Studies , Genomics
15.
Am J Ophthalmol ; 255: 161-169, 2023 11.
Article in English | MEDLINE | ID: mdl-37490992

ABSTRACT

PURPOSE: To develop an automated deep learning system for detecting the presence and location of disc hemorrhages in optic disc photographs. DESIGN: Development and testing of a deep learning algorithm. METHODS: Optic disc photos (597 images with at least 1 disc hemorrhage and 1075 images without any disc hemorrhage from 1562 eyes) from 5 institutions were classified by expert graders based on the presence or absence of disc hemorrhage. The images were split into training (n = 1340), validation (n = 167), and test (n = 165) datasets. Two state-of-the-art deep learning algorithms based on either object-level detection or image-level classification were trained on the dataset. These models were compared to one another and against 2 independent glaucoma specialists. We evaluated model performance by the area under the receiver operating characteristic curve (AUC). AUCs were compared with the Hanley-McNeil method. RESULTS: The object detection model achieved an AUC of 0.936 (95% CI = 0.857-0.964) across all held-out images (n = 165 photographs), which was significantly superior to the image classification model (AUC = 0.845, 95% CI = 0.740-0.912; P = .006). At an operating point selected for high specificity, the model achieved a specificity of 94.3% and a sensitivity of 70.0%, which was statistically indistinguishable from an expert clinician (P = .7). At an operating point selected for high sensitivity, the model achieves a sensitivity of 96.7% and a specificity of 73.3%. CONCLUSIONS: An autonomous object detection model is superior to an image classification model for detecting disc hemorrhages, and performed comparably to 2 clinicians.


Subject(s)
Deep Learning , Glaucoma , Optic Disk , Optic Nerve Diseases , Humans , Optic Disk/diagnostic imaging , Optic Nerve Diseases/diagnosis , Glaucoma/diagnosis , ROC Curve , Algorithms , Retinal Hemorrhage/diagnosis
16.
medRxiv ; 2023 Mar 02.
Article in English | MEDLINE | ID: mdl-36909470

ABSTRACT

Background: Spironolactone has been proposed as a potential modulator of SARS-CoV-2 cellular entry. We aimed to measure the effect of spironolactone use on the risk of adverse outcomes following COVID-19 hospitalization. Methods: We performed a retrospective cohort study of COVID-19 outcomes for patients with or without exposure to spironolactone, using population-scale claims data from the Komodo Healthcare Map. We identified all patients with a hospital admission for COVID-19 in the study window, defining treatment status based on spironolactone prescription orders. The primary outcomes were progression to respiratory ventilation or mortality during the hospitalization. Odds ratios (OR) were estimated following either 1:1 propensity score matching (PSM) or multivariable regression. Subgroup analysis was performed based on age, gender, body mass index (BMI), and dominant SARS-CoV-2 variant. Findings: Among 898,303 eligible patients with a COVID-19-related hospitalization, 16,324 patients (1.8%) had a spironolactone prescription prior to hospitalization. 59,937 patients (6.7%) met the ventilation endpoint, and 26,515 patients (3.0%) met the mortality endpoint. Spironolactone use was associated with a significant reduction in odds of both ventilation (OR 0.82; 95% CI: 0.75-0.88; p < 0.001) and mortality (OR 0.88; 95% CI: 0.78-0.99; p = 0.033) in the PSM analysis, supported by the regression analysis. Spironolactone use was associated with significantly reduced odds of ventilation for all age groups, men, women, and non-obese patients, with the greatest protective effects in younger patients, men, and non-obese patients. Interpretation: Spironolactone use was associated with a protective effect against ventilation and mortality following COVID-19 infection, amounting to up to 64% of the protective effect of vaccination against ventilation and consistent with an androgen-dependent mechanism. 
The findings warrant initiation of large-scale randomized controlled trials to establish a potential therapeutic role for spironolactone in COVID-19 patients.
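
As a quick consistency check on the Findings above, the reported percentages can be recomputed from the quoted patient counts. This is only an illustrative sketch: the variable and function names (`eligible`, `counts`, `percent_of`) are hypothetical and not taken from the study, and only the counts themselves come from the abstract.

```python
# Counts quoted in the abstract above; all names here are illustrative.
eligible = 898_303
counts = {
    "spironolactone": 16_324,  # prescription prior to hospitalization
    "ventilation": 59_937,     # met the ventilation endpoint
    "mortality": 26_515,       # met the mortality endpoint
}


def percent_of(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Share of the eligible cohort in each group, for comparison with the
# percentages reported in the abstract (1.8%, 6.7%, 3.0%).
shares = {name: percent_of(n, eligible) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```

The recomputed shares agree with the abstract's rounded figures. The odds ratios themselves cannot be reproduced this way, since they depend on matched patient-level data that the abstract does not report.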

17.
Biomolecules ; 13(2), 2023 02 18.
Article in English | MEDLINE | ID: mdl-36830756

ABSTRACT

Drug abuse is a serious problem in the United States, with over 90,000 drug overdose deaths nationally in 2020. A key step in combating drug abuse is detecting, monitoring, and characterizing its trends over time and location, also known as pharmacovigilance. While federal reporting systems accomplish this to a degree, they often have high latency and incomplete coverage. Social-media-based pharmacovigilance has zero latency, is easily accessible and unfiltered, and benefits from drug users being willing to share their experiences online pseudo-anonymously. However, unlike highly structured official data sources, social media text is rife with misspellings and slang, making automated analysis difficult. Generative Pretrained Transformer 3 (GPT-3) is a large autoregressive language model specialized for few-shot learning that was trained on text from the entire internet. We demonstrate that GPT-3 can be used to generate slang and common misspellings of terms for drugs of abuse. We repeatedly queried GPT-3 for synonyms of drugs of abuse and filtered the generated terms using automated Google searches and cross-references to known drug names. When generated terms for alprazolam were manually labeled, we found that our method produced 269 synonyms for alprazolam, 221 of which were new discoveries not included in an existing drug lexicon for social media. We repeated this process for 98 drugs of abuse, of which 22 are widely-discussed drugs of abuse, building a lexicon of colloquial drug synonyms that can be used for pharmacovigilance on social media.


Subject(s)
Social Media , Substance-Related Disorders , United States , Humans , Pharmacovigilance , Alprazolam , Natural Language Processing
18.
Nat Commun ; 14(1): 738, 2023 02 10.
Article in English | MEDLINE | ID: mdl-36759510

ABSTRACT

Existing annotation paradigms rely on controlled vocabularies, where each data instance is classified into one term from a predefined set of controlled vocabularies. This paradigm restricts the analysis to concepts that are known and well-characterized. Here, we present the novel multilingual translation method BioTranslator to address this problem. BioTranslator takes a user-written textual description of a new concept and then translates this description to a non-text biological data instance. The key idea of BioTranslator is to develop a multilingual translation framework, where multiple modalities of biological data are all translated to text. We demonstrate how BioTranslator enables the identification of novel cell types using only a textual description and how BioTranslator can be further generalized to protein function prediction and drug target identification. Our tool frees scientists from limiting their analyses within predefined controlled vocabularies, enabling them to interact with biological data using free text.


Subject(s)
Multilingualism , Vocabulary, Controlled , Proteins
19.
medRxiv ; 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36712099

ABSTRACT

The case-control study is a widely used method for investigating the genetic landscape of binary traits. However, the health-related outcome or disease status of participants in long-term, prospective cohort studies such as the UK Biobank are subject to change. Here, we develop an approach for the genetic association study leveraging disease liabilities computed from a deep patient phenotyping framework (AI-based liability). Analyzing 44 common traits in 261,807 participants from the UK Biobank, we identified novel loci compared to the conventional case-control (CC) association studies. Our results showed that combining liability scores with CC status was more powerful than the CC-GWAS in detecting independent genetic loci across different diseases. This boost in statistical power was further reflected in increased SNP-based heritability estimates. Moreover, polygenic risk scores calculated from AI-based liabilities better identified newly diagnosed cases in the 2022 release of the UK Biobank that served as controls in the 2019 version (6.2% percentile rank increase on average). These findings demonstrate the utility of deep neural networks that are able to model disease liabilities from high-dimensional phenotypic data in large-scale population cohorts. Our pipeline of genome-wide association studies with disease liabilities can be applied to other biobanks with rich phenotype and genotype data.

20.
iScience ; 26(1): 105802, 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36636354

ABSTRACT

Non-alcoholic fatty liver disease is a heterogeneous disease with unclear underlying molecular mechanisms. Here, we perform single-cell RNA sequencing of hepatocytes and hepatic non-parenchymal cells to map the lipid signatures in mice with non-alcoholic fatty liver disease (NAFLD). We uncover previously unidentified clusters of hepatocytes characterized by either high or low srebp1 expression. Surprisingly, the canonical lipid synthesis driver Srebp1 is not predictive of hepatic lipid accumulation, suggestive of other drivers of lipid metabolism. By combining transcriptional data at single-cell resolution with computational network analyses, we find that NAFLD is associated with high constitutive androstane receptor (CAR) expression. Mechanistically, CAR interacts with four functional modules: cholesterol homeostasis, bile acid metabolism, fatty acid metabolism, and estrogen response. Nuclear expression of CAR positively correlates with steatohepatitis in human livers. These findings demonstrate significant cellular differences in lipid signatures and identify functional networks linked to hepatic steatosis in mice and humans.
