Results 1 - 20 of 358
1.
Nat Commun ; 15(1): 5577, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38956082

ABSTRACT

Recent advances in single-cell immune profiling have enabled the simultaneous measurement of transcriptome and T cell receptor (TCR) sequences, offering great potential for studying immune responses at the cellular level. However, integrating these diverse modalities across datasets is challenging due to their unique data characteristics and technical variations. Here, to address this, we develop the multimodal generative model mvTCR to fuse modality-specific information across transcriptome and TCR into a shared representation. Our analysis demonstrates the added value of multimodal over unimodal approaches to capture antigen specificity. Notably, we use mvTCR to distinguish T cell subpopulations binding to SARS-CoV-2 antigens from bystander cells. Furthermore, when combined with reference mapping approaches, mvTCR can map newly generated datasets to extensive T cell references, facilitating knowledge transfer. In summary, we envision mvTCR to enable a scalable analysis of multimodal immune profiling data and advance our understanding of immune responses.


Subjects
COVID-19, T-Cell Antigen Receptors, SARS-CoV-2, Single-Cell Analysis, Transcriptome, T-Cell Antigen Receptors/metabolism, T-Cell Antigen Receptors/genetics, T-Cell Antigen Receptors/immunology, Single-Cell Analysis/methods, Humans, SARS-CoV-2/immunology, SARS-CoV-2/genetics, COVID-19/immunology, COVID-19/virology, T-Lymphocytes/immunology, T-Lymphocytes/metabolism, Gene Expression Profiling/methods, Viral Antigens/immunology, Viral Antigens/genetics
2.
Nature ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987588

ABSTRACT

Chronic hepatitis B virus (HBV) infection affects 300 million patients worldwide [1,2], in whom virus-specific CD8 T cells by still ill-defined mechanisms lose their function and cannot eliminate HBV-infected hepatocytes [3-7]. Here we demonstrate that a liver immune rheostat renders virus-specific CD8 T cells refractory to activation and leads to their loss of effector functions. In preclinical models of persistent infection with hepatotropic viruses such as HBV, dysfunctional virus-specific CXCR6+ CD8 T cells accumulated in the liver and, as a characteristic hallmark, showed enhanced transcriptional activity of cAMP-responsive element modulator (CREM) distinct from T cell exhaustion. In patients with chronic hepatitis B, circulating and intrahepatic HBV-specific CXCR6+ CD8 T cells with enhanced CREM expression and transcriptional activity were detected at a frequency of 12-22% of HBV-specific CD8 T cells. Knocking out the inhibitory CREM/ICER isoform in T cells, however, failed to rescue T cell immunity. This indicates that CREM activity was a consequence, rather than the cause, of loss in T cell function, further supported by the observation of enhanced phosphorylation of protein kinase A (PKA) which is upstream of CREM. Indeed, we found that enhanced cAMP-PKA-signalling from increased T cell adenylyl cyclase activity augmented CREM activity and curbed T cell activation and effector function in persistent hepatic infection. Mechanistically, CD8 T cells recognizing their antigen on hepatocytes established close and extensive contact with liver sinusoidal endothelial cells, thereby enhancing adenylyl cyclase-cAMP-PKA signalling in T cells. In these hepatic CD8 T cells, which recognize their antigen on hepatocytes, phosphorylation of key signalling kinases of the T cell receptor signalling pathway was impaired, which rendered them refractory to activation. Thus, close contact with liver sinusoidal endothelial cells curbs the activation and effector function of HBV-specific CD8 T cells that target hepatocytes expressing viral antigens by means of the adenylyl cyclase-cAMP-PKA axis in an immune rheostat-like fashion.

3.
Genome Biol ; 25(1): 181, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38978088

ABSTRACT

Single-cell multiomic analysis of the epigenome, transcriptome, and proteome allows for comprehensive characterization of the molecular circuitry that underpins cell identity and state. However, the holistic interpretation of such datasets presents a challenge given a paucity of approaches for systematic, joint evaluation of different modalities. Here, we present Panpipes, a set of computational workflows designed to automate multimodal single-cell and spatial transcriptomic analyses by incorporating widely-used Python-based tools to perform quality control, preprocessing, integration, clustering, and reference mapping at scale. Panpipes allows reliable and customizable analysis and evaluation of individual and integrated modalities, thereby empowering decision-making before downstream investigations.


Subjects
Single-Cell Analysis, Software, Transcriptome, Single-Cell Analysis/methods, Gene Expression Profiling/methods, Humans, Workflow
4.
Nat Methods ; 21(7): 1196-1205, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38871986

ABSTRACT

Single-cell RNA sequencing allows us to model cellular state dynamics and fate decisions using expression similarity or RNA velocity to reconstruct state-change trajectories; however, trajectory inference does not incorporate valuable time point information or utilize additional modalities, whereas methods that address these different data views cannot be combined or do not scale. Here we present CellRank 2, a versatile and scalable framework to study cellular fate using multiview single-cell data of up to millions of cells in a unified fashion. CellRank 2 consistently recovers terminal states and fate probabilities across data modalities in human hematopoiesis and endodermal development. Our framework also allows combining transitions within and across experimental time points, a feature we use to recover genes promoting medullary thymic epithelial cell formation during pharyngeal endoderm development. Moreover, we enable estimating cell-specific transcription and degradation rates from metabolic-labeling data, which we apply to an intestinal organoid system to delineate differentiation trajectories and pinpoint regulatory strategies.


Subjects
Cell Differentiation, Single-Cell Analysis, Single-Cell Analysis/methods, Humans, Endoderm/cytology, Endoderm/metabolism, Hematopoiesis, Cell Lineage, RNA Sequence Analysis/methods, Organoids/metabolism, Organoids/cytology
5.
Genome Med ; 16(1): 80, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862979

ABSTRACT

The study of immunology, traditionally reliant on proteomics to evaluate individual immune cells, has been revolutionized by single-cell RNA sequencing. Computational immunologists play a crucial role in analysing these datasets, moving beyond traditional protein marker identification to encompass a more detailed view of cellular phenotypes and their functional roles. Recent technological advancements allow the simultaneous measurements of multiple cellular components (transcriptome, proteome, chromatin, epigenetic modifications and metabolites) within single cells, including in spatial contexts within tissues. This has led to the generation of complex multiscale datasets that can include multimodal measurements from the same cells or a mix of paired and unpaired modalities. Modern machine learning (ML) techniques allow for the integration of multiple "omics" data without the need for extensive independent modelling of each modality. This review focuses on recent advancements in ML integrative approaches applied to immunological studies. We highlight the importance of these methods in creating a unified representation of multiscale data collections, particularly for single-cell and spatial profiling technologies. Finally, we discuss the challenges of these holistic approaches and how they will be instrumental in the development of a common coordinate framework for multiscale studies, thereby accelerating research and enabling discoveries in the computational immunology field.


Subjects
Computational Biology, Machine Learning, Humans, Computational Biology/methods, Single-Cell Analysis/methods, Allergy and Immunology, Animals, Immunoinformatics
6.
Bioinformatics ; 40(Supplement_1): i548-i557, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940138

ABSTRACT

SUMMARY: Spatial omics technologies are increasingly leveraged to characterize how disease disrupts tissue organization and cellular niches. While multiple methods to analyze spatial variation within a sample have been published, statistical and computational approaches to compare cell spatial organization across samples or conditions are mostly lacking. We present GraphCompass, a comprehensive set of omics-adapted graph analysis methods to quantitatively evaluate and compare the spatial arrangement of cells in samples representing diverse biological conditions. GraphCompass builds upon the Squidpy spatial omics toolbox and encompasses various statistical approaches to perform cross-condition analyses at the level of individual cell types, niches, and samples. Additionally, GraphCompass provides custom visualization functions that enable effective communication of results. We demonstrate how GraphCompass can be used to address key biological questions, such as how cellular organization and tissue architecture differ across various disease states and which spatial patterns correlate with a given pathological condition. GraphCompass can be applied to various popular omics techniques, including, but not limited to, spatial proteomics (e.g. MIBI-TOF), spot-based transcriptomics (e.g. 10× Genomics Visium), and single-cell resolved transcriptomics (e.g. Stereo-seq). In this work, we showcase the capabilities of GraphCompass through its application to three different studies that may also serve as benchmark datasets for further method development. With its easy-to-use implementation, extensive documentation, and comprehensive tutorials, GraphCompass is accessible to biologists with varying levels of computational expertise. By facilitating comparative analyses of cell spatial organization, GraphCompass promises to be a valuable asset in advancing our understanding of tissue function in health and disease.


Subjects
Software, Humans, Proteomics/methods, Computational Biology/methods, Genomics/methods, Animals, Transcriptome, Single-Cell Analysis/methods
7.
Genes (Basel) ; 15(6), 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38927741

ABSTRACT

Bronchopulmonary dysplasia (BPD) is a chronic lung disease commonly affecting premature infants, with limited therapeutic options and increased long-term consequences. Adrenomedullin (Adm), a proangiogenic peptide hormone, has been found to protect rodents against experimental BPD. This study aims to elucidate the molecular and cellular mechanisms through which Adm influences BPD pathogenesis using a lipopolysaccharide (LPS)-induced model of experimental BPD in mice. Bulk RNA sequencing of Adm-sufficient (wild-type or Adm+/+) and Adm-haplodeficient (Adm+/-) mice lungs, integrated with single-cell RNA sequencing data, revealed distinct gene expression patterns and cell type alterations associated with Adm deficiency and LPS exposure. Notably, computational integration with cell atlas data revealed that Adm-haplodeficient mouse lungs exhibited gene expression signatures characteristic of increased inflammation, natural killer (NK) cell frequency, and decreased endothelial cell and type II pneumocyte frequency. Furthermore, in silico human BPD patient data analysis supported our cell type frequency finding, highlighting elevated NK cells in BPD infants. These results underscore the protective role of Adm in experimental BPD and emphasize that it is a potential therapeutic target for BPD infants with an inflammatory phenotype.


Subjects
Adrenomedullin, Bronchopulmonary Dysplasia, Adrenomedullin/genetics, Adrenomedullin/metabolism, Bronchopulmonary Dysplasia/genetics, Bronchopulmonary Dysplasia/pathology, Bronchopulmonary Dysplasia/metabolism, Animals, Mice, Humans, RNA Sequence Analysis/methods, Animal Disease Models, Lipopolysaccharides, Lung/metabolism, Lung/pathology, Natural Killer Cells/metabolism, Natural Killer Cells/immunology, Transcriptome
8.
Nat Comput Sci ; 4(5): 367-378, 2024 May.
Article in English | MEDLINE | ID: mdl-38730184

ABSTRACT

Large language models have greatly enhanced our ability to understand biology and chemistry, yet robust methods for structure-based drug discovery, quantum chemistry and structural biology are still sparse. Precise biomolecule-ligand interaction datasets are urgently needed for large language models. To address this, we present MISATO, a dataset that combines quantum mechanical properties of small molecules and associated molecular dynamics simulations of ~20,000 experimental protein-ligand complexes with extensive validation of experimental data. Starting from the existing experimental structures, semi-empirical quantum mechanics was used to systematically refine these structures. A large collection of molecular dynamics traces of protein-ligand complexes in explicit water is included, accumulating over 170 µs. We give examples of machine learning (ML) baseline models proving an improvement of accuracy by employing our data. An easy entry point for ML experts is provided to enable the next generation of drug discovery artificial intelligence models.


Subjects
Drug Discovery, Machine Learning, Molecular Dynamics Simulation, Proteins, Ligands, Drug Discovery/methods, Proteins/chemistry, Proteins/metabolism, Quantum Theory
9.
Cell ; 187(10): 2343-2358, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38729109

ABSTRACT

As the number of single-cell datasets continues to grow rapidly, workflows that map new data to well-curated reference atlases offer enormous promise for the biological community. In this perspective, we discuss key computational challenges and opportunities for single-cell reference-mapping algorithms. We discuss how mapping algorithms will enable the integration of diverse datasets across disease states, molecular modalities, genetic perturbations, and diverse species and will eventually replace manual and laborious unsupervised clustering pipelines.


Subjects
Algorithms, Single-Cell Analysis, Single-Cell Analysis/methods, Humans, Computational Biology/methods, Data Analysis, Animals, Cluster Analysis
10.
Res Sq ; 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38645152

ABSTRACT

With the growing number of single-cell analysis tools, benchmarks are increasingly important to guide analysis and method development. However, a lack of standardisation and extensibility in current benchmarks limits their usability, longevity, and relevance to the community. We present Open Problems, a living, extensible, community-guided benchmarking platform including 10 current single-cell tasks that we envision will raise standards for the selection, evaluation, and development of methods in single-cell analysis.

11.
J Hepatol ; 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38583492

ABSTRACT

BACKGROUND & AIMS: Polyploidy in hepatocytes has been proposed as a genetic mechanism to buffer against transcriptional dysregulation. Here, we aim to demonstrate the role of polyploidy in modulating gene regulatory networks in hepatocytes during ageing. METHODS: We performed single-nucleus RNA sequencing in hepatocyte nuclei of different ploidy levels isolated from young and old wild-type mice. Changes in the gene expression and regulatory network were compared to three independent strains that were haploinsufficient for HNF4A, CEBPA or CTCF, representing non-deleterious perturbations. Phenotypic characteristics of the liver section were additionally evaluated histologically, whereas the genomic allele composition of hepatocytes was analysed by BaseScope. RESULTS: We observed that ageing in wild-type mice results in nuclei polyploidy and a marked increase in steatosis. Haploinsufficiency of liver-specific master regulators (HNF4A or CEBPA) results in the enrichment of hepatocytes with tetraploid nuclei at a young age, affecting the genomic regulatory network, and dramatically suppressing ageing-related steatosis tissue wide. Notably, these phenotypes are not the result of subtle disruption to liver-specific transcriptional networks, since haploinsufficiency in the CTCF insulator protein resulted in the same phenotype. Further quantification of genotypes of tetraploid hepatocytes in young and old HNF4A-haploinsufficient mice revealed that during ageing, tetraploid hepatocytes lead to the selection of wild-type alleles, restoring non-deleterious genetic perturbations. CONCLUSIONS: Our results suggest a model whereby polyploidisation leads to fundamentally different cell states. Polyploid conversion enables pleiotropic buffering against age-related decline via non-random allelic segregation to restore a wild-type genome. IMPACT AND IMPLICATIONS: The functional role of hepatocyte polyploidisation during ageing is poorly understood. Using single-nucleus RNA sequencing and BaseScope approaches, we have studied ploidy dynamics during ageing in murine livers with non-deleterious genetic perturbations. We have identified that hepatocytes present different cellular states and the ability to buffer ageing-associated dysfunctions. Tetraploid nuclei exhibit robust transcriptional networks and are better adapted to genomically overcome perturbations. Novel therapeutic interventions aimed at attenuating age-related changes in tissue function could be exploited by manipulation of ploidy dynamics during chronic liver conditions.

12.
Genome Biol ; 25(1): 109, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38671451

ABSTRACT

Single-cell multiplexing techniques (cell hashing and genetic multiplexing) combine multiple samples, optimizing sample processing and reducing costs. Cell hashing conjugates antibody-tags or chemical-oligonucleotides to cell membranes, while genetic multiplexing allows mixing of genetically diverse samples and relies on aggregation of RNA reads at known genomic coordinates. We develop hadge (hashing deconvolution combined with genotype information), a Nextflow pipeline that combines 12 methods to perform both hashing- and genotype-based deconvolution. We propose a joint deconvolution strategy combining best-performing methods and demonstrate how this approach leads to the recovery of previously discarded cells in a nuclei hashing of fresh-frozen brain tissue.


Subjects
Single-Cell Analysis, Single-Cell Analysis/methods, Humans, Brain/metabolism, Brain/cytology, Software, Genotype
13.
Nat Commun ; 15(1): 2866, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38570482

ABSTRACT

Traumatic brain injury leads to a highly orchestrated immune- and glial cell response partially responsible for long-lasting disability and the development of secondary neurodegenerative diseases. A holistic understanding of the mechanisms controlling the responses of specific cell types and their crosstalk is required to develop an efficient strategy for better regeneration. Here, we combine spatial and single-cell transcriptomics to chart the transcriptomic signature of the injured male murine cerebral cortex, and identify specific states of different glial cells contributing to this signature. Interestingly, distinct glial cells share a large fraction of injury-regulated genes, including inflammatory programs downstream of the innate immune-associated pathways Cxcr3 and Tlr1/2. Systemic manipulation of these pathways decreases the reactivity state of glial cells associated with poor regeneration. The functional relevance of the discovered shared signature of glial cells highlights the importance of our resource enabling comprehensive analysis of early events after brain injury.


Subjects
Brain Injuries, Stab Wounds, Animals, Mice, Male, Glial Fibrillary Acidic Protein/metabolism, Neuroglia/metabolism, Brain Injuries/metabolism, Cerebral Cortex/metabolism, Stab Wounds/complications, Stab Wounds/metabolism
14.
Nat Methods ; 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38509327

ABSTRACT

Spatially resolved omics technologies are transforming our understanding of biological tissues. However, the handling of uni- and multimodal spatial omics datasets remains a challenge owing to large data volumes, heterogeneity of data types and the lack of flexible, spatially aware data structures. Here we introduce SpatialData, a framework that establishes a unified and extensible multiplatform file-format, lazy representation of larger-than-memory data, transformations and alignment to common coordinate systems. SpatialData facilitates spatial annotations and cross-modal aggregation and analysis, the utility of which is illustrated in the context of multiple vignettes, including integrative analysis on a multimodal Xenium and Visium breast cancer study.

15.
Bioinformatics ; 40(4), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38485697

ABSTRACT

SUMMARY: Accurate clustering of mixed data, encompassing binary, categorical, and continuous variables, is vital for effective patient stratification in clinical questionnaire analysis. To address this need, we present longmixr, a comprehensive R package providing a robust framework for clustering mixed longitudinal data using finite mixture modeling techniques. By incorporating consensus clustering, longmixr ensures reliable and stable clustering results. Moreover, the package includes a detailed vignette that facilitates cluster exploration and visualization. AVAILABILITY AND IMPLEMENTATION: The R package is freely available at https://cran.r-project.org/package=longmixr with detailed documentation, including a case vignette, at https://cellmapslab.github.io/longmixr/.


Subjects
Software, Humans, Cross-Sectional Studies, Cluster Analysis, Surveys and Questionnaires
16.
Eur Respir J ; 63(2), 2024 Feb.
Article in English | MEDLINE | ID: mdl-38212077

ABSTRACT

BACKGROUND: Fibroblast-to-myofibroblast conversion is a major driver of tissue remodelling in organ fibrosis. Distinct lineages of fibroblasts support homeostatic tissue niche functions, yet their specific activation states and phenotypic trajectories during injury and repair have remained unclear. METHODS: We combined spatial transcriptomics, multiplexed immunostainings, longitudinal single-cell RNA-sequencing and genetic lineage tracing to study fibroblast fates during mouse lung regeneration. Our findings were validated in idiopathic pulmonary fibrosis patient tissues in situ as well as in cell differentiation and invasion assays using patient lung fibroblasts. Cell differentiation and invasion assays established a function of SFRP1 in regulating human lung fibroblast invasion in response to transforming growth factor (TGF)β1. MEASUREMENTS AND MAIN RESULTS: We discovered a transitional fibroblast state characterised by high Sfrp1 expression, derived from both Tcf21-Cre lineage positive and negative cells. Sfrp1+ cells appeared early after injury in peribronchiolar, adventitial and alveolar locations and preceded the emergence of myofibroblasts. We identified lineage-specific paracrine signals and inferred converging transcriptional trajectories towards Sfrp1+ transitional fibroblasts and Cthrc1+ myofibroblasts. TGFβ1 downregulated SFRP1 in noninvasive transitional cells and induced their switch to an invasive CTHRC1+ myofibroblast identity. Finally, using loss-of-function studies we showed that SFRP1 modulates TGFβ1-induced fibroblast invasion and RHOA pathway activity. CONCLUSIONS: Our study reveals the convergence of spatially and transcriptionally distinct fibroblast lineages into transcriptionally uniform myofibroblasts and identifies SFRP1 as a modulator of TGFβ1-driven fibroblast phenotypes in fibrogenesis. These findings are relevant in the context of therapeutic interventions that aim at limiting or reversing fibroblast foci formation.


Subjects
Idiopathic Pulmonary Fibrosis, Myofibroblasts, Mice, Animals, Humans, Myofibroblasts/metabolism, Fibroblasts/metabolism, Lung/metabolism, Idiopathic Pulmonary Fibrosis/metabolism, Cell Differentiation, Transforming Growth Factor beta1/metabolism, Extracellular Matrix Proteins/metabolism, Membrane Proteins/genetics, Membrane Proteins/metabolism
17.
Nat Methods ; 21(1): 28-31, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38049697

ABSTRACT

Single-cell ATAC sequencing coverage in regulatory regions is typically binarized as an indicator of open chromatin. Here we show that binarization is an unnecessary step that neither improves goodness of fit, clustering, cell type identification nor batch integration. Fragment counts, but not read counts, should instead be modeled, which preserves quantitative regulatory information. These results have immediate implications for single-cell ATAC sequencing analysis.


Subjects
Chromatin Immunoprecipitation Sequencing, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Chromatin/genetics, Single-Cell Analysis
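
The contrast drawn in the abstract of entry 17 between binarized coverage and fragment counts can be illustrated with a small sketch. Everything here is illustrative: the toy read counts are made up, and the read-to-fragment rule (each paired-end fragment contributes at most two reads, so fragments are estimated as the read count halved and rounded up) is an assumption for this example, not a procedure quoted from the abstract.

```python
import math

def binarize(counts):
    """Collapse per-region counts to a 0/1 open-chromatin indicator."""
    return [1 if c > 0 else 0 for c in counts]

def reads_to_fragments(read_counts):
    """Estimate fragment counts from read counts.

    Assumption (not from the abstract): a fragment yields at most two
    reads, one per end, so ceil(reads / 2) is used as the estimate.
    """
    return [math.ceil(r / 2) for r in read_counts]

# Hypothetical read counts for two cells over the same four regions.
cell_a = [0, 2, 8, 1]
cell_b = [0, 1, 2, 4]

# After binarization the two cells are indistinguishable.
print(binarize(cell_a), binarize(cell_b))

# Fragment counts keep the quantitative difference between them.
print(reads_to_fragments(cell_a), reads_to_fragments(cell_b))
```

In this toy case both cells binarize to the same vector, while their estimated fragment counts differ, which is the kind of quantitative information the abstract says binarization discards.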
18.
Article in English | MEDLINE | ID: mdl-38086412

ABSTRACT

BACKGROUND: In optical coherence tomography (OCT) scans of patients with inherited retinal diseases (IRDs), the measurement of the thickness of the outer nuclear layer (ONL) has been well established as a surrogate marker for photoreceptor preservation. Current automatic segmentation tools fail in OCT segmentation in IRDs, and manual segmentation is time-consuming. METHODS AND MATERIAL: Patients with IRD and an available OCT scan were screened for the present study. Additionally, OCT scans of patients without retinal disease were included to provide training data for artificial intelligence (AI). We trained a U-net-based model on healthy patients and applied a domain adaptation technique to the IRD patients' scans. RESULTS: We established an AI-based image segmentation algorithm that reliably segments the ONL in OCT scans of IRD patients. In a test dataset, the dice score of the algorithm was 98.7%. Furthermore, we generated thickness maps of the full retinal thickness and the ONL for each patient. CONCLUSION: Accurate segmentation of anatomical layers on OCT scans plays a crucial role for predictive models linking retinal structure to visual function. Our algorithm for segmentation of OCT images could provide the basis for further studies on IRDs.
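
For readers unfamiliar with the dice score reported above, the following minimal sketch shows how it is commonly computed for two binary segmentation masks, as twice the overlap divided by the total number of foreground pixels in both masks. The function name and the toy masks are hypothetical and are not taken from the study.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    # Flatten both masks so they can be compared pixel by pixel.
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Two empty masks are treated as perfect agreement by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total

# Toy 3x4 masks (made-up values, not data from the study).
predicted = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 1, 0]]
reference = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0]]

print(round(dice_score(predicted, reference), 3))
```

Here the predicted mask has 5 foreground pixels, the reference has 6, and all 5 predicted pixels overlap, giving 2 x 5 / (5 + 6), or about 0.909.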

19.
bioRxiv ; 2024 Feb 10.
Article in English | MEDLINE | ID: mdl-37961672

ABSTRACT

Integration of single-cell RNA-sequencing (scRNA-seq) datasets has become a standard part of the analysis, with conditional variational autoencoders (cVAE) being among the most popular approaches. Increasingly, researchers are asking to map cells across challenging cases such as cross-organs, species, or organoids and primary tissue, as well as different scRNA-seq protocols, including single-cell and single-nuclei. Current computational methods struggle to harmonize datasets with such substantial differences, driven by technical or biological variation. Here, we propose to address these challenges for the popular cVAE-based approaches by introducing and comparing a series of regularization constraints. The two commonly used strategies for increasing batch correction in cVAEs, that is Kullback-Leibler divergence (KL) regularization strength tuning and adversarial learning, suffer from substantial loss of biological information. Therefore, we adapt, implement, and assess alternative regularization strategies for cVAEs and investigate how they improve batch effect removal or better preserve biological variation, enabling us to propose an optimal cVAE-based integration strategy for complex systems. We show that using a VampPrior instead of the commonly used Gaussian prior not only improves the preservation of biological variation but also, unexpectedly, batch correction. Moreover, we show that our implementation of cycle-consistency loss leads to significantly better biological preservation than adversarial learning implemented in the previously proposed GLUE model. Additionally, we do not recommend relying only on the KL regularization strength tuning for increasing batch correction, as it removes both biological and batch information without discriminating between the two. Based on our findings, we propose a new model that combines VampPrior and cycle-consistency loss. We show that using it for datasets with substantial batch effects improves downstream interpretation of cell states and biological conditions. To ease the use of the newly proposed model, we make it available in the scvi-tools package as an external model named sysVI. Moreover, in the future, these regularization techniques could be added to other established cVAE-based models to improve the integration of datasets with substantial batch effects.
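
As background for the terms used in this abstract, a generic cVAE objective with a tunable KL weight can be written as below. This is the textbook form, given here as a sketch; the symbols (expression profile x, batch covariate c, latent variable z, KL weight β, K pseudo-inputs u_k) are notation assumed for illustration, and the exact loss used by sysVI may differ.

```latex
% Generic cVAE loss: reconstruction term plus a KL term weighted by \beta
\mathcal{L}(x, c) =
  -\,\mathbb{E}_{q_\phi(z \mid x, c)}\!\left[\log p_\theta(x \mid z, c)\right]
  + \beta \, \mathrm{KL}\!\left(q_\phi(z \mid x, c) \,\|\, p(z)\right)

% Standard Gaussian prior
p(z) = \mathcal{N}(0, I)

% VampPrior: a mixture of the encoder's posteriors at K learnable pseudo-inputs
p(z) = \frac{1}{K} \sum_{k=1}^{K} q_\phi(z \mid u_k)
```

Raising β pulls every posterior toward the prior regardless of whether the variation it encodes is biological or technical, which is consistent with the abstract's caution against relying on KL strength tuning alone.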

20.
Nat Methods ; 21(1): 50-59, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37735568

ABSTRACT

RNA velocity has been rapidly adopted to guide interpretation of transcriptional dynamics in snapshot single-cell data; however, current approaches for estimating RNA velocity lack effective strategies for quantifying uncertainty and determining the overall applicability to the system of interest. Here, we present veloVI (velocity variational inference), a deep generative modeling framework for estimating RNA velocity. veloVI learns a gene-specific dynamical model of RNA metabolism and provides a transcriptome-wide quantification of velocity uncertainty. We show that veloVI compares favorably to previous approaches with respect to goodness of fit, consistency across transcriptionally similar cells and stability across preprocessing pipelines for quantifying RNA abundance. Further, we demonstrate that veloVI's posterior velocity uncertainty can be used to assess whether velocity analysis is appropriate for a given dataset. Finally, we highlight veloVI as a flexible framework for modeling transcriptional dynamics by adapting the underlying dynamical model to use time-dependent transcription rates.


Subjects
RNA, Transcriptome, RNA/genetics, Learning