Results 1 - 20 of 78
1.
medRxiv ; 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38496488

ABSTRACT

Optimal treatments depend on numerous factors such as drug chemical properties, disease biology, and patient characteristics to which the treatment is applied. To realize the promise of AI in healthcare, there is a need for designing systems that can capture patient heterogeneity and relevant biomedical knowledge. Here we present PlaNet, a geometric deep learning framework that reasons over population variability, disease biology, and drug chemistry by representing knowledge in the form of a massive clinical knowledge graph that can be enhanced by language models. Our framework is applicable to any sub-population, any drug as well as drug combinations, any disease, and to a wide range of pharmacological tasks. We apply the PlaNet framework to reason about outcomes of clinical trials: PlaNet predicts drug efficacy and adverse events, even for experimental drugs and their combinations that have never been seen by the model. Furthermore, PlaNet can estimate the effect of changing population on the trial outcome with direct implications on patient stratification in clinical trials. PlaNet takes fundamental steps towards AI-guided clinical trials design, offering valuable guidance for realizing the vision of precision medicine using AI.

2.
Nat Methods ; 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38366243

ABSTRACT

Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, interspecies genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN can detect functionally related genes coexpressed across species, redefining differential expression for cross-species analysis. Applying SATURN to three species whole-organism atlases and frog and zebrafish embryogenesis datasets, we show that SATURN can effectively transfer annotations across species, even when they are evolutionarily remote. We also demonstrate that SATURN can be used to find potentially divergent gene functions between glaucoma-associated genes in humans and four other species.

3.
Nature ; 624(7992): 586-592, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38030732

ABSTRACT

A long-standing expectation is that large, dense and cosmopolitan areas support socioeconomic mixing and exposure among diverse individuals1-6. Assessing this hypothesis has been difficult because previous measures of socioeconomic mixing have relied on static residential housing data rather than real-life exposures among people at work, in places of leisure and in home neighbourhoods7,8. Here we develop a measure of exposure segregation that captures the socioeconomic diversity of these everyday encounters. Using mobile phone mobility data to represent 1.6 billion real-world exposures among 9.6 million people in the United States, we measure exposure segregation across 382 metropolitan statistical areas (MSAs) and 2,829 counties. We find that exposure segregation is 67% higher in the ten largest MSAs than in small MSAs with fewer than 100,000 residents. This means that, contrary to expectations, residents of large cosmopolitan areas have less exposure to a socioeconomically diverse range of individuals. Second, we find that the increased socioeconomic segregation in large cities arises because they offer a greater choice of differentiated spaces targeted to specific socioeconomic groups. Third, we find that this segregation-increasing effect is countered when a city's hubs (such as shopping centres) are positioned to bridge diverse neighbourhoods and therefore attract people of all socioeconomic statuses. Our findings challenge a long-standing conjecture in human geography and highlight how urban design can both prevent and facilitate encounters among diverse individuals.


Subject(s)
Cities , Social Network Analysis , Social Networking , Socioeconomic Factors , Urban Population , Humans , Cell Phone , Cities/statistics & numerical data , Housing/statistics & numerical data , Models, Theoretical , Residence Characteristics/statistics & numerical data , United States , Urban Population/statistics & numerical data
4.
Nature ; 620(7972): 47-60, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37532811

ABSTRACT

Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.


Subject(s)
Artificial Intelligence , Research Design , Artificial Intelligence/standards , Artificial Intelligence/trends , Datasets as Topic , Deep Learning , Research Design/standards , Research Design/trends , Unsupervised Machine Learning
6.
Nat Biotechnol ; 2023 Aug 17.
Article in English | MEDLINE | ID: mdl-37592036

ABSTRACT

Understanding cellular responses to genetic perturbation is central to numerous biomedical applications, from identifying genetic interactions involved in cancer to developing methods for regenerative medicine. However, the combinatorial explosion in the number of possible multigene perturbations severely limits experimental interrogation. Here, we present graph-enhanced gene activation and repression simulator (GEARS), a method that integrates deep learning with a knowledge graph of gene-gene relationships to predict transcriptional responses to both single and multigene perturbations using single-cell RNA-sequencing data from perturbational screens. GEARS is able to predict outcomes of perturbing combinations consisting of genes that were never experimentally perturbed. GEARS exhibited 40% higher precision than existing approaches in predicting four distinct genetic interaction subtypes in a combinatorial perturbation screen and identified the strongest interactions twice as well as prior approaches. Overall, GEARS can predict phenotypically distinct effects of multigene perturbations and thus guide the design of perturbational experiments.

7.
Nature ; 619(7970): 572-584, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37468586

ABSTRACT

The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health1. The intestine has a length of over nine metres, along which there are differences in structure and function2. The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function. Here, to better understand these differences, we evaluated the organization of single cells using multiplexed imaging and single-nucleus RNA and open chromatin assays across eight different intestinal sites from nine donors. Through systematic analyses, we find cell compositions that differ substantially across regions of the intestine and demonstrate the complexity of epithelial subtypes, and find that the same cell types are organized into distinct neighbourhoods and communities, highlighting distinct immunological niches that are present in the intestine. We also map gene regulatory differences in these cells that are suggestive of a regulatory differentiation cascade, and associate intestinal disease heritability with specific cell types. These results describe the complexity of the cell composition, regulation and organization for this organ, and serve as an important reference map for understanding human biology and disease.


Subject(s)
Intestines , Single-Cell Analysis , Humans , Cell Differentiation/genetics , Chromatin/genetics , Epithelial Cells/cytology , Epithelial Cells/metabolism , Gene Expression Regulation , Intestinal Mucosa/cytology , Intestines/cytology , Intestines/immunology , Single-Cell Gene Expression Analysis
8.
Science ; 380(6650): eadg0934, 2023 06 16.
Article in English | MEDLINE | ID: mdl-37319212

ABSTRACT

Aging is characterized by a decline in tissue function, but the underlying changes at cellular resolution across the organism remain unclear. Here, we present the Aging Fly Cell Atlas, a single-nucleus transcriptomic map of the whole aging Drosophila. We characterized 163 distinct cell types and performed an in-depth analysis of changes in tissue cell composition, gene expression, and cell identities. We further developed aging clock models to predict fly age and show that ribosomal gene expression is a conserved predictive factor for age. Combining all aging features, we find distinctive cell type-specific aging patterns. This atlas provides a valuable resource for studying fundamental principles of aging in complex organisms.


Subject(s)
Aging , Cellular Senescence , Drosophila melanogaster , Animals , Aging/genetics , Gene Expression Profiling , Transcriptome , Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Drosophila melanogaster/physiology , Atlases as Topic
9.
J Biomed Inform ; 143: 104407, 2023 07.
Article in English | MEDLINE | ID: mdl-37271308

ABSTRACT

OBJECTIVE: To determine whether graph neural network-based models of electronic health records can predict specialty consultation care needs for endocrinology and hematology more accurately than the standard-of-care checklists and other conventional medical recommendation algorithms in the literature. METHODS: Demand for medical expertise far outstrips supply, with tens of millions in the US alone with deficient access to specialty care. Rather than potentially months-long delays to initiate diagnostic workup and medical treatment with a specialist, referring primary care supported by an automated recommender algorithm could anticipate and directly initiate patient evaluation that would otherwise be needed at a subsequent specialist appointment. We propose a novel graph representation learning approach with a heterogeneous graph neural network to model structured electronic health records and formulate recommendation/prediction of subsequent specialist orders as a link prediction problem. RESULTS: Models are trained and assessed in two specialty care sites: endocrinology and hematology. Our experimental results show that our model achieves an 8% improvement in ROC-AUC for endocrinology (ROC-AUC = 0.88) and 5% improvement for hematology (ROC-AUC = 0.84) personalized procedure recommendations over prior medical recommender systems. These recommender algorithm approaches provide medical procedure recommendations for endocrinology referrals more effectively than manual clinical checklists (recommender: precision = 0.60, recall = 0.27, F1-score = 0.37) vs. (checklist: precision = 0.16, recall = 0.28, F1-score = 0.20), and similarly for hematology referrals (recommender: precision = 0.44, recall = 0.38, F1-score = 0.41) vs. (checklist: precision = 0.27, recall = 0.71, F1-score = 0.39). CONCLUSION: Embedding graph neural network models into clinical care can improve digital specialty consultation systems and expand the access to medical experience of prior similar cases.
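The F1-scores quoted in the RESULTS can be reproduced from the reported precision and recall using the standard definition, F1 = 2PR / (P + R). The snippet below is only an arithmetic check on the abstract's figures; the function name `f1_score` and the dictionary layout are illustrative choices, not anything taken from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F1 as reported in the abstract)
reported = {
    "endocrinology, recommender": (0.60, 0.27, 0.37),
    "endocrinology, checklist": (0.16, 0.28, 0.20),
    "hematology, recommender": (0.44, 0.38, 0.41),
    "hematology, checklist": (0.27, 0.71, 0.39),
}

for name, (p, r, f1) in reported.items():
    # Compare the recomputed value with the figure quoted in the abstract.
    print(f"{name}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {f1:.2f}")
```

All four recomputed values agree with the reported ones to two decimal places. The hematology pair shows why F1 is reported alongside precision and recall: the checklist's high recall (0.71) is offset by its low precision (0.27), leaving its F1 slightly below the recommender's.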


Subject(s)
Algorithms , Neural Networks, Computer , Humans , Electronic Health Records , Referral and Consultation , Endocrinology , Hematology
10.
Nature ; 616(7956): 259-265, 2023 04.
Article in English | MEDLINE | ID: mdl-37045921

ABSTRACT

The exceptionally rapid development of highly flexible, reusable artificial intelligence (AI) models is likely to usher in newfound capabilities in medicine. We propose a new paradigm for medical AI, which we refer to as generalist medical AI (GMAI). GMAI models will be capable of carrying out a diverse set of tasks using very little or no task-specific labelled data. Built through self-supervision on large, diverse datasets, GMAI will flexibly interpret different combinations of medical modalities, including data from imaging, electronic health records, laboratory results, genomics, graphs or medical text. Models will in turn produce expressive outputs such as free-text explanations, spoken recommendations or image annotations that demonstrate advanced medical reasoning abilities. Here we identify a set of high-impact potential applications for GMAI and lay out specific technical capabilities and training datasets necessary to enable them. We expect that GMAI-enabled applications will challenge current strategies for regulating and validating AI devices for medicine and will shift practices associated with the collection of large medical datasets.


Subject(s)
Artificial Intelligence , Medicine , Diagnostic Imaging , Electronic Health Records , Genomics , Datasets as Topic , Unsupervised Machine Learning , Humans
11.
bioRxiv ; 2023 Sep 24.
Article in English | MEDLINE | ID: mdl-36778387

ABSTRACT

Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, inter-species genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here, we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN has a unique ability to detect functionally related genes co-expressed across species, redefining differential expression for cross-species analysis. We apply SATURN to three species whole-organism atlases and frog and zebrafish embryogenesis datasets. We show that cell embeddings learnt in SATURN can be effectively used to transfer annotations across species and identify both homologous and species-specific cell types, even across evolutionarily remote species. Finally, we use SATURN to reannotate the five-species Cell Atlas of Human Trabecular Meshwork and Aqueous Outflow Structures and find evidence of potentially divergent functions between glaucoma-associated genes in humans and other species.

12.
Pac Symp Biocomput ; 28: 61-72, 2023.
Article in English | MEDLINE | ID: mdl-36540965

ABSTRACT

Biological networks are powerful representations for the discovery of molecular phenotypes. Fundamental to network analysis is the principle, rooted in social networks, that nodes that interact in the network tend to have similar properties. While this long-standing principle underlies powerful methods in biology that associate molecules with phenotypes on the basis of network proximity, interacting molecules are not necessarily similar, and molecules with similar properties do not necessarily interact. Here, we show that molecules are more likely to have similar phenotypes, not if they directly interact in a molecular network, but if they interact with the same molecules. We call this the mutual interactor principle and show that it holds for several kinds of molecular networks, including protein-protein interaction, genetic interaction, and signaling networks. We then develop a machine learning framework for predicting molecular phenotypes on the basis of mutual interactors. Strikingly, the framework can predict drug targets, disease proteins, and protein functions in different species, and it performs better than much more complex algorithms. The framework is robust to incomplete biological data and is capable of generalizing to phenotypes it has not seen during training. Our work represents a network-based predictive platform for phenotypic characterization of biological molecules.
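The distinction between direct interaction and shared interactors can be illustrated with a toy network. The sketch below is not the authors' machine learning framework; it assumes a deliberately simple scoring rule (count the interactors a candidate shares with molecules already known to have the phenotype), and the names `build_graph`, `mutual_interactor_score`, and the molecule labels are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) interaction pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def mutual_interactor_score(graph, candidate, seeds):
    """Count interactors shared between a candidate and each seed molecule.

    Assumption: a plain shared-neighbour count stands in for the learned
    model described in the abstract.
    """
    candidate_neighbours = graph.get(candidate, set())
    return sum(
        len(candidate_neighbours & graph.get(seed, set()))
        for seed in seeds
        if seed != candidate
    )


# Toy network: A is a molecule with a known phenotype (the seed).
# B shares two interactors (X, Y) with A but does not interact with A.
# C interacts with A directly but shares no interactors with it.
edges = [("A", "X"), ("B", "X"), ("A", "Y"), ("B", "Y"), ("A", "C"), ("C", "Z")]
graph = build_graph(edges)
seeds = {"A"}

scores = {
    node: mutual_interactor_score(graph, node, seeds)
    for node in graph
    if node not in seeds
}
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, score)
```

In this toy case B, which never touches A directly, ranks above A's direct partner C. That is the behaviour the mutual interactor principle predicts, in contrast to scoring by network proximity.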


Subject(s)
Algorithms , Computational Biology , Computational Biology/methods , Proteins/metabolism , Machine Learning , Phenotype , Protein Interaction Maps
13.
Nat Methods ; 19(11): 1411-1418, 2022 11.
Article in English | MEDLINE | ID: mdl-36280720

ABSTRACT

Accurate cell-type annotation from spatially resolved single cells is crucial to understand functional spatial biology that is the basis of tissue organization. However, current computational methods for annotating spatially resolved single-cell data are typically based on techniques established for dissociated single-cell technologies and thus do not take spatial organization into account. Here we present STELLAR, a geometric deep learning method for cell-type discovery and identification in spatially resolved single-cell datasets. STELLAR automatically assigns cells to cell types present in the annotated reference dataset and discovers novel cell types and cell states. STELLAR transfers annotations across different dissection regions, different tissues and different donors, and learns cell representations that capture higher-order tissue structures. We successfully applied STELLAR to CODEX multiplexed fluorescent microscopy data and multiplexed RNA imaging datasets. Within the Human BioMolecular Atlas Program, STELLAR has annotated 2.6 million spatially resolved single cells with dramatic time savings.


Subject(s)
Single-Cell Analysis , Humans , Microscopy, Fluorescence
15.
Science ; 375(6584): eabk2432, 2022 03 04.
Article in English | MEDLINE | ID: mdl-35239393

ABSTRACT

For more than 100 years, the fruit fly Drosophila melanogaster has been one of the most studied model organisms. Here, we present a single-cell atlas of the adult fly, Tabula Drosophilae, that includes 580,000 nuclei from 15 individually dissected sexed tissues as well as the entire head and body, annotated to >250 distinct cell types. We provide an in-depth analysis of cell type-related gene signatures and transcription factor markers, as well as sexual dimorphism, across the whole animal. Analysis of common cell types between tissues, such as blood and muscle cells, reveals rare cell types and tissue-specific subtypes. This atlas provides a valuable resource for the Drosophila community and serves as a reference to study genetic perturbations and disease models at single-cell resolution.


Subject(s)
Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Transcriptome , Animals , Cell Nucleus/metabolism , Databases, Genetic , Drosophila Proteins/genetics , Drosophila melanogaster/physiology , Female , Gene Expression Regulation , Gene Regulatory Networks , Genes, Insect , Male , RNA-Seq , Sex Characteristics , Single-Cell Analysis , Transcription Factors/genetics
16.
Nat Commun ; 13(1): 267, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35042849

ABSTRACT

An unhealthy diet is a major risk factor for chronic diseases including cardiovascular disease, type 2 diabetes, and cancer1-4. Limited access to healthy food options may contribute to unhealthy diets5,6. Studying diets is challenging, typically restricted to small sample sizes, single locations, and non-uniform design across studies, and has led to mixed results on the impact of the food environment7-23. Here we leverage smartphones to track diet health, operationalized through the self-reported consumption of fresh fruits and vegetables, fast food and soda, as well as body-mass index status in a country-wide observational study of 1,164,926 U.S. participants (MyFitnessPal app users) and 2.3 billion food entries to study the independent contributions of fast food and grocery store access, income and education to diet health outcomes. This study constitutes the largest nationwide study examining the relationship between the food environment and diet to date. We find that higher access to grocery stores, lower access to fast food, higher income and college education are independently associated with higher consumption of fresh fruits and vegetables, lower consumption of fast food and soda, and lower likelihood of being affected by overweight and obesity. However, these associations vary significantly across zip codes with predominantly Black, Hispanic or white populations. For instance, high grocery store access has a significantly larger association with higher fruit and vegetable consumption in zip codes with predominantly Hispanic populations (7.4% difference) and Black populations (10.2% difference) in contrast to zip codes with predominantly white populations (1.7% difference). Policy targeted at improving food access, income and education may increase healthy eating, but intervention allocation may need to be optimized for specific subpopulations and locations.


Subject(s)
Diet , Residence Characteristics , Body Mass Index , Cross-Sectional Studies , Diabetes Mellitus, Type 2/epidemiology , Diet/statistics & numerical data , Food Supply , Fruit , Humans , Income , Obesity , Risk Factors , Socioeconomic Factors , United States/epidemiology , Vegetables
17.
EPJ Data Sci ; 10(1): 57, 2021.
Article in English | MEDLINE | ID: mdl-34966638

ABSTRACT

In this paper we analyze the effect of shocks in production networks. Our work is based on a rich dataset that contains information about companies from Slovenia right after the financial crisis of 2008. The processed data span 8 years and cover the transaction history as well as performance indicators and various metadata of the companies. We define sales shocks at different levels, and identify companies impacted by them. Next we investigate stress, the potential immediate upstream and downstream impact of a shock within the production network. We base our main findings on a matched pairs analysis of stressed companies. We find that both shock and stress are associated with reporting bankruptcy in the future and that stress foremost impacts the future sales of customers. Furthermore, we find evidence that stress results not only in performance losses but also in the reconfiguration of the production network. We show that stressed companies actively seek new trading partners, and that these new links often share the industry of the shocked company. These results suggest that both stressed customers and suppliers react quickly to stress and adjust their trading relationships.

18.
Nat Commun ; 12(1): 5556, 2021 09 21.
Article in English | MEDLINE | ID: mdl-34548483

ABSTRACT

Single cell technologies are rapidly generating large amounts of data that enable us to understand biological systems at single-cell resolution. However, joint analysis of datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types that are part of the controlled vocabulary that forms the Cell Ontology. A key advantage of OnClass is its capability to classify cells into cell types not present in the training data because it uses the Cell Ontology graph to infer cell type relationships. Furthermore, OnClass can be used to identify marker genes for all the cell ontology categories, regardless of whether the cell types are present or absent in the training data, suggesting that OnClass goes beyond a simple annotation tool for single cell datasets, being the first algorithm capable of identifying marker genes specific to all terms of the Cell Ontology and offering the possibility of refining the Cell Ontology using a data-centric approach.


Subject(s)
Cell Lineage/genetics , Eukaryotic Cells/classification , Software , Terminology as Topic , Vocabulary, Controlled , Algorithms , Animals , Biomarkers/metabolism , Datasets as Topic , Gene Expression , Humans
19.
Proc Natl Acad Sci U S A ; 118(38)2021 09 21.
Article in English | MEDLINE | ID: mdl-34526401

ABSTRACT

Deceased public figures are often said to live on in collective memory. We quantify this phenomenon by tracking mentions of 2,362 public figures in English-language online news and social media (Twitter) 1 y before and after death. We measure the sharp spike and rapid decay of attention following death and model collective memory as a composition of communicative and cultural memory. Clustering reveals four patterns of postmortem memory, and regression analysis shows that boosts in media attention are largest for premortem popular anglophones who died a young, unnatural death; that long-term boosts are smallest for leaders and largest for artists; and that, while both the news and Twitter are triggered by young and unnatural deaths, the news additionally curates collective memory when old persons or leaders die. Overall, we illuminate the age-old question of who is remembered by society, and the distinct roles of news and social media in collective memory formation.


Subject(s)
Mass Media/trends , Social Identification , Social Media/trends , Communication , Humans , Mass Gatherings , Memory , Sociological Factors
20.
Neuron ; 109(16): 2556-2572.e6, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34197732

ABSTRACT

Neurological and psychiatric disorders are associated with pathological neural dynamics. The fundamental connectivity patterns of cell-cell communication networks that enable pathological dynamics to emerge remain unknown. Here, we studied epileptic circuits using a newly developed computational pipeline that leveraged single-cell calcium imaging of larval zebrafish and chronically epileptic mice, biologically constrained effective connectivity modeling, and higher-order motif-focused network analysis. We uncovered a novel functional cell type that preferentially emerged in the preseizure state, the superhub, that was unusually richly connected to the rest of the network through feedforward motifs, critically enhancing downstream excitation. Perturbation simulations indicated that disconnecting superhubs was significantly more effective in stabilizing epileptic circuits than disconnecting hub cells that were defined traditionally by connection count. In the dentate gyrus of chronically epileptic mice, superhubs were predominately modeled adult-born granule cells. Collectively, these results predict a new maximally selective and minimally invasive cellular target for seizure control.


Subject(s)
Cell Communication/physiology , Epilepsy/physiopathology , Neurons/physiology , Seizures/physiopathology , Animals , Dentate Gyrus/pathology , Dentate Gyrus/physiopathology , Nerve Net/physiopathology , Zebrafish