Search | VHL Regional Portal

1.

Modeling fragment counts improves single-cell ATAC-seq analysis.

Martens, Laura D; Fischer, David S; Yépez, Vicente A; Theis, Fabian J; Gagneur, Julien.

Nat Methods ; 21(1): 28-31, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38049697

ABSTRACT

Single-cell ATAC sequencing coverage in regulatory regions is typically binarized as an indicator of open chromatin. Here we show that binarization is an unnecessary step that neither improves goodness of fit, clustering, cell type identification nor batch integration. Fragment counts, but not read counts, should instead be modeled, which preserves quantitative regulatory information. These results have immediate implications for single-cell ATAC sequencing analysis.

Subject(s)

Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Chromatin/genetics , Single-Cell Analysis
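
The distinction the abstract above draws between binarized coverage, read counts, and fragment counts can be illustrated with a small sketch. The conversion rule used here (round odd read counts up to the next even number, then halve) is an assumption for illustration, motivated by paired-end sequencing yielding up to two reads per fragment. The abstract does not spell out the rule, and the matrix values are made up.

```python
import numpy as np

def binarize(counts):
    """Collapse counts to a 0/1 open-chromatin indicator."""
    return (np.asarray(counts) > 0).astype(int)

def reads_to_fragments(read_counts):
    """Approximate fragment counts from paired-end read counts.

    Assumption: each fragment contributes up to two reads, so odd
    counts are rounded up to the next even number before halving.
    """
    return np.ceil(np.asarray(read_counts) / 2).astype(int)

# Toy cell-by-peak matrix of read counts (made-up values).
reads = np.array([[0, 1, 2, 4],
                  [2, 0, 3, 6]])

fragments = reads_to_fragments(reads)  # keeps quantitative information
binary = binarize(reads)               # keeps only "open" vs "closed"
```

In this toy matrix, peaks with 2, 3, and 6 reads all collapse to the same binary value, whereas the approximate fragment counts keep them apart.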

2.

Programs, Origins, and Niches of Immunomodulatory Myeloid Cells in Gliomas.

Miller, Tyler E; El Farran, Chadi A; Couturier, Charles P; Chen, Zeyu; D'Antonio, Joshua P; Verga, Julia; Villanueva, Martin A; Castro, L Nicolas Gonzalez; Tong, Yuzhou Evelyn; Saadi, Tariq Al; Chiocca, Andrew N; Fischer, David S; Heiland, Dieter Henrik; Guerriero, Jennifer L; Petrecca, Kevin; Suva, Mario L; Shalek, Alex K; Bernstein, Bradley E.

bioRxiv ; 2023 Oct 27.

Article in English | MEDLINE | ID: mdl-37961527

ABSTRACT

Gliomas are incurable malignancies notable for an immunosuppressive microenvironment with abundant myeloid cells whose immunomodulatory properties remain poorly defined. Here, utilizing scRNA-seq data for 183,062 myeloid cells from 85 human tumors, we discover that nearly all glioma-associated myeloid cells express at least one of four immunomodulatory activity programs: Scavenger Immunosuppressive, C1Q Immunosuppressive, CXCR4 Inflammatory, and IL1B Inflammatory. All four programs are present in IDH1 mutant and wild-type gliomas and are expressed in macrophages, monocytes, and microglia whether of blood or resident myeloid cell origins. Integrating our scRNA-seq data with mitochondrial DNA-based lineage tracing, spatial transcriptomics, and organoid explant systems that model peripheral monocyte infiltration, we show that these programs are driven by microenvironmental cues and therapies rather than myeloid cell type, origin, or mutation status. The C1Q Immunosuppressive program is driven by routinely administered dexamethasone. The Scavenger Immunosuppressive program includes ligands with established roles in T-cell suppression, is induced in hypoxic regions, and is associated with immunotherapy resistance. Both immunosuppressive programs are less prevalent in lower-grade gliomas, which are instead enriched for the CXCR4 Inflammatory program. Our study provides a framework to understand immunomodulatory myeloid cells in glioma, and a foundation to develop more effective immunotherapies.

3.

Scaling cross-tissue single-cell annotation models.

Fischer, Felix; Fischer, David S; Biederstedt, Evan; Villani, Alexandra-Chloé; Theis, Fabian J.

bioRxiv ; 2023 Oct 10.

Article in English | MEDLINE | ID: mdl-37873298

ABSTRACT

Identifying cellular identities (both novel and well-studied) is one of the key use cases in single-cell transcriptomics. While supervised machine learning has been leveraged to automate cell annotation predictions for some time, there has been relatively little progress both in scaling neural networks to large data sets and in constructing models that generalize well across diverse tissues and biological contexts up to whole organisms. Here, we propose scTab, an automated, feature-attention-based cell type prediction model specific to tabular data, and train it using a novel data augmentation scheme across a large corpus of single-cell RNA-seq observations (22.2 million human cells in total). In addition, scTab leverages deep ensembles for uncertainty quantification. Moreover, we account for ontological relationships between labels in the model evaluation to accommodate for differences in annotation granularity across datasets. On this large-scale corpus, we show that cross-tissue annotation requires nonlinear models and that the performance of scTab scales in terms of training dataset size as well as model size - demonstrating the advantage of scTab over current state-of-the-art linear models in this context. Additionally, we show that the proposed data augmentation schema improves model generalization. In summary, we introduce a de novo cell type prediction model for single-cell RNA-seq data that can be trained across a large-scale collection of curated datasets from a diverse selection of human tissues and demonstrate the benefits of using deep learning methods in this paradigm. Our codebase, training data, and model checkpoints are publicly available at https://github.com/theislab/scTab to further enable rigorous benchmarks of foundation models for single-cell RNA-seq data.

4.

Modeling intercellular communication in tissues using spatial graphs of cells.

Fischer, David S; Schaar, Anna C; Theis, Fabian J.

Nat Biotechnol ; 41(3): 332-336, 2023 03.

Article in English | MEDLINE | ID: mdl-36302986

ABSTRACT

Models of intercellular communication in tissues are based on molecular profiles of dissociated cells, are limited to receptor-ligand signaling and ignore spatial proximity in situ. We present node-centric expression modeling, a method based on graph neural networks that estimates the effects of niche composition on gene expression in an unbiased manner from spatial molecular profiling data. We recover signatures of molecular processes known to underlie cell communication.

Subject(s)

Cell Communication , Signal Transduction , Cell Communication/genetics , Signal Transduction/genetics , Neural Networks, Computer

5.

Probing cell identity hierarchies by fate titration and collision during direct reprogramming.

Hersbach, Bob A; Fischer, David S; Masserdotti, Giacomo; Mojzisová, Karolina; Waltzhöni, Thomas; Rodriguez-Terrones, Diego; Heinig, Matthias; Theis, Fabian J; Götz, Magdalena; Stricker, Stefan H.

Mol Syst Biol ; 18(9): e11129, 2022 09.

Article in English | MEDLINE | ID: mdl-36106915

ABSTRACT

Despite the therapeutic promise of direct reprogramming, basic principles concerning fate erasure and the mechanisms to resolve cell identity conflicts remain unclear. To tackle these fundamental questions, we established a single-cell protocol for the simultaneous analysis of multiple cell fate conversion events based on combinatorial and traceable reprogramming factor expression: Collide-seq. Collide-seq revealed the lack of a common mechanism through which fibroblast-specific gene expression loss is initiated. Moreover, we found that the transcriptome of converting cells abruptly changes when a critical level of each reprogramming factor is attained, with higher or lower levels not contributing to major changes. By simultaneously inducing multiple competing reprogramming factors, we also found a deterministic system, in which titration of fates against each other yields dominant or colliding fates. By investigating one collision in detail, we show that reprogramming factors can disturb cell identity programs independent of their ability to bind their target genes. Taken together, Collide-seq has shed light on several fundamental principles of fate conversion that may aid in improving current reprogramming paradigms.

Subject(s)

Cellular Reprogramming , Fibroblasts , Cell Differentiation/genetics , Cellular Reprogramming/genetics , Fibroblasts/metabolism , Transcriptome/genetics

6.

Spatial components of molecular tissue biology.

Palla, Giovanni; Fischer, David S; Regev, Aviv; Theis, Fabian J.

Nat Biotechnol ; 40(3): 308-318, 2022 03.

Article in English | MEDLINE | ID: mdl-35132261

ABSTRACT

Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analysis of tissues and develop the requisite computational tools to address them. Developers of analytical tools need to decide on the intrinsic molecular features of each cell that need to be considered, and how cell shape and morphological features are incorporated into the analysis. Also, optimal ways to compare different tissue samples at various length scales are still being sought. Grouping these biological problems and related computational algorithms into classes across length scales, thus characterizing common issues that need to be addressed, will facilitate further progress in spatial transcriptomics and proteomics.

Subject(s)

Proteomics , Transcriptome , Algorithms , Computational Biology/methods , Spatial Analysis , Transcriptome/genetics

7.

Ultra-high sensitivity mass spectrometry quantifies single-cell proteome changes upon perturbation.

Brunner, Andreas-David; Thielert, Marvin; Vasilopoulou, Catherine; Ammar, Constantin; Coscia, Fabian; Mund, Andreas; Hoerning, Ole B; Bache, Nicolai; Apalategui, Amalia; Lubeck, Markus; Richter, Sabrina; Fischer, David S; Raether, Oliver; Park, Melvin A; Meier, Florian; Theis, Fabian J; Mann, Matthias.

Mol Syst Biol ; 18(3): e10798, 2022 03.

Article in English | MEDLINE | ID: mdl-35226415

ABSTRACT

Single-cell technologies are revolutionizing biology but are today mainly limited to imaging and deep sequencing. However, proteins are the main drivers of cellular function and in-depth characterization of individual cells by mass spectrometry (MS)-based proteomics would thus be highly valuable and complementary. Here, we develop a robust workflow combining miniaturized sample preparation, very low flow-rate chromatography, and a novel trapped ion mobility mass spectrometer, resulting in a more than 10-fold improved sensitivity. We precisely and robustly quantify proteomes and their changes in single, FACS-isolated cells. Arresting cells at defined stages of the cell cycle by drug treatment retrieves expected key regulators. Furthermore, it highlights potential novel ones and allows cell phase prediction. Comparing the variability in more than 430 single-cell proteomes to transcriptome data revealed a stable-core proteome despite perturbation, while the transcriptome appears stochastic. Our technology can readily be applied to ultra-high sensitivity analyses of tissue material, posttranslational modifications, and small molecule studies from small cell counts to gain unprecedented insights into cellular heterogeneity in health and disease.

Subject(s)

Proteome , Proteomics , Mass Spectrometry/methods , Protein Processing, Post-Translational , Proteome/metabolism , Proteomics/methods , Workflow

8.

Cell-Type-Specific Impact of Glucocorticoid Receptor Activation on the Developing Brain: A Cerebral Organoid Study.

Cruceanu, Cristiana; Dony, Leander; Krontira, Anthi C; Fischer, David S; Roeh, Simone; Di Giaimo, Rossella; Kyrousi, Christina; Kaspar, Lea; Arloth, Janine; Czamara, Darina; Gerstner, Nathalie; Martinelli, Silvia; Wehner, Stefanie; Breen, Michael S; Koedel, Maik; Sauer, Susann; Sportelli, Vincenza; Rex-Haffner, Monika; Cappello, Silvia; Theis, Fabian J; Binder, Elisabeth B.

Am J Psychiatry ; 179(5): 375-387, 2022 05.

Article in English | MEDLINE | ID: mdl-34698522

ABSTRACT

OBJECTIVE: A fine-tuned balance of glucocorticoid receptor (GR) activation is essential for organ formation, with disturbances influencing many health outcomes. In utero, glucocorticoids have been linked to brain-related negative outcomes, with unclear underlying mechanisms, especially regarding cell-type-specific effects. An in vitro model of fetal human brain development, induced human pluripotent stem cell (hiPSC)-derived cerebral organoids, was used to test whether cerebral organoids are suitable for studying the impact of prenatal glucocorticoid exposure on the developing brain. METHODS: The GR was activated with the synthetic glucocorticoid dexamethasone, and the effects were mapped using single-cell transcriptomics across development. RESULTS: The GR was expressed in all cell types, with increasing expression levels through development. Not only did its activation elicit translocation to the nucleus and the expected effects on known GR-regulated pathways, but also neurons and progenitor cells showed targeted regulation of differentiation- and maturation-related transcripts. Uniquely in neurons, differentially expressed transcripts were significantly enriched for genes associated with behavior-related phenotypes and disorders. This human neuronal glucocorticoid response profile was validated across organoids from three independent hiPSC lines reprogrammed from different source tissues from both male and female donors. CONCLUSIONS: These findings suggest that excessive glucocorticoid exposure could interfere with neuronal maturation in utero, leading to increased disease susceptibility through neurodevelopmental processes at the interface of genetic susceptibility and environmental exposure. Cerebral organoids are a valuable translational resource for exploring the effects of glucocorticoids on early human brain development.

Subject(s)

Induced Pluripotent Stem Cells , Receptors, Glucocorticoid , Brain/metabolism , Dexamethasone/metabolism , Dexamethasone/pharmacology , Female , Glucocorticoids/adverse effects , Humans , Induced Pluripotent Stem Cells/metabolism , Male , Organoids/metabolism , Pregnancy , Receptors, Glucocorticoid/genetics

9.

Toward modeling metabolic state from single-cell transcriptomics.

Hrovatin, Karin; Fischer, David S; Theis, Fabian J.

Mol Metab ; 57: 101396, 2022 03.

Article in English | MEDLINE | ID: mdl-34785394

ABSTRACT

BACKGROUND: Single-cell metabolic studies bring new insights into cellular function, which can often not be captured on other omics layers. Metabolic information has wide applicability, such as for the study of cellular heterogeneity or for the understanding of drug mechanisms and biomarker development. However, metabolic measurements on single-cell level are limited by insufficient scalability and sensitivity, as well as resource intensiveness, and are currently not possible in parallel with measuring transcript state, commonly used to identify cell types. Nevertheless, because omics layers are strongly intertwined, it is possible to make metabolic predictions based on measured data of more easily measurable omics layers together with prior metabolic network knowledge. SCOPE OF REVIEW: We summarize the current state of single-cell metabolic measurement and modeling approaches, motivating the use of computational techniques. We review three main classes of computational methods used for prediction of single-cell metabolism: pathway-level analysis, constraint-based modeling, and kinetic modeling. We describe the unique challenges arising when transitioning from bulk to single-cell modeling. Finally, we propose potential model extensions and computational methods that could be leveraged to achieve these goals. MAJOR CONCLUSIONS: Single-cell metabolic modeling is a rising field that provides a new perspective for understanding cellular functions. The presented modeling approaches vary in terms of input requirements and assumptions, scalability, modeled metabolic layers, and newly gained insights. We believe that the use of prior metabolic knowledge will lead to more robust predictions and will pave the way for mechanistic and interpretable machine-learning models.

Subject(s)

Models, Biological , Transcriptome , Kinetics , Metabolic Networks and Pathways/genetics

10.

EpiScanpy: integrated single-cell epigenomic analysis.

Danese, Anna; Richter, Maria L; Chaichoompu, Kridsadakorn; Fischer, David S; Theis, Fabian J; Colomé-Tatché, Maria.

Nat Commun ; 12(1): 5228, 2021 09 01.

Article in English | MEDLINE | ID: mdl-34471111

ABSTRACT

EpiScanpy is a toolkit for the analysis of single-cell epigenomic data, namely single-cell DNA methylation and single-cell ATAC-seq data. To address the modality specific challenges from epigenomics data, epiScanpy quantifies the epigenome using multiple feature space constructions and builds a nearest neighbour graph using epigenomic distance between cells. EpiScanpy makes the many existing scRNA-seq workflows from scanpy available to large-scale single-cell data from other -omics modalities, including methods for common clustering, dimension reduction, cell type identification and trajectory learning techniques, as well as an atlas integration tool for scATAC-seq datasets. The toolkit also features numerous useful downstream functions, such as differential methylation and differential openness calling, mapping epigenomic features of interest to their nearest gene, or constructing gene activity matrices using chromatin openness. We successfully benchmark epiScanpy against other scATAC-seq analysis tools and show its outperformance at discriminating cell types.

Subject(s)

Epigenomics/methods , Single-Cell Analysis/methods , Chromatin , Chromatin Immunoprecipitation Sequencing , Cluster Analysis , DNA Methylation , Humans , Sequence Analysis, RNA

11.

Group Testing for SARS-CoV-2 Allows for Up to 10-Fold Efficiency Increase Across Realistic Scenarios and Testing Strategies.

Verdun, Claudio M; Fuchs, Tim; Harar, Pavol; Elbrächter, Dennis; Fischer, David S; Berner, Julius; Grohs, Philipp; Theis, Fabian J; Krahmer, Felix.

Front Public Health ; 9: 583377, 2021.

Article in English | MEDLINE | ID: mdl-34490172

ABSTRACT

Background: Due to the ongoing COVID-19 pandemic, demand for diagnostic testing has increased drastically, resulting in shortages of necessary materials to conduct the tests and overwhelming the capacity of testing laboratories. The supply scarcity and capacity limits affect test administration: priority must be given to hospitalized patients and symptomatic individuals, which can prevent the identification of asymptomatic and presymptomatic individuals and hence effective tracking and tracing policies. We describe optimized group testing strategies applicable to SARS-CoV-2 tests in scenarios tailored to the current COVID-19 pandemic and assess significant gains compared to individual testing. Methods: We account for biochemically realistic scenarios in the context of dilution effects on SARS-CoV-2 samples and consider evidence on specificity and sensitivity of PCR-based tests for the novel coronavirus. Because of the current uncertainty and the temporal and spatial changes in the prevalence regime, we provide analysis for several realistic scenarios and propose fast and reliable strategies for massive testing procedures. Key Findings: We find significant efficiency gaps between different group testing strategies in realistic scenarios for SARS-CoV-2 testing, highlighting the need for an informed decision of the pooling protocol depending on estimated prevalence, target specificity, and high- vs. low-risk population. For example, using one of the presented methods, all 1.47 million inhabitants of Munich, Germany, could be tested using only around 141 thousand tests if an infection rate below 0.4% is assumed. Using 1 million tests, the 6.69 million inhabitants of the city of Rio de Janeiro, Brazil, could be tested as long as the infection rate does not exceed 1%. Moreover, we provide an interactive web application, available at www.grouptexting.com, for visualizing the different strategies and designing pooling schemes according to specific prevalence scenarios and test configurations. Interpretation: Altogether, this work may help provide a basis for an efficient upscaling of current testing procedures, which takes the population heterogeneity into account and is fine-grained towards the desired study populations, e.g., mild/asymptomatic individuals vs. symptomatic ones but also mixtures thereof. Funding: German Science Foundation (DFG), German Federal Ministry of Education and Research (BMBF), Chan Zuckerberg Initiative DAF, and Austrian Science Fund (FWF).

Subject(s)

COVID-19 , SARS-CoV-2 , Brazil , COVID-19 Testing , Humans , Pandemics
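
To give a feel for the arithmetic behind pooled testing, the sketch below uses classical two-stage (Dorfman) pooling as a baseline. This is a textbook scheme swapped in for illustration, not one of the strategies from the paper. It assumes a perfect test, independent infections, and no dilution effects, and the function names are hypothetical.

```python
def expected_tests_per_person(prevalence, pool_size):
    """Expected tests per person under two-stage (Dorfman) pooling.

    One pooled test is shared by `pool_size` people; if the pool is
    positive (probability 1 - (1 - p)^k), every member is retested
    individually. Assumes a perfect test and independent infections.
    """
    if pool_size == 1:
        return 1.0  # individual testing, no pooling stage
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + p_pool_positive

def best_pool_size(prevalence, max_pool_size=64):
    """Pool size in 1..max_pool_size with the fewest expected tests."""
    return min(range(1, max_pool_size + 1),
               key=lambda k: expected_tests_per_person(prevalence, k))

prevalence = 0.004                     # 0.4%, as in the Munich example
k = best_pool_size(prevalence)
rate = expected_tests_per_person(prevalence, k)
tests_needed = round(1_470_000 * rate)  # population figure from the abstract
```

Under these assumptions the baseline needs roughly 0.12 tests per person at 0.4% prevalence. The Munich figure quoted in the abstract corresponds to about 0.1 tests per person, which suggests the strategies presented in the paper improve on this simple scheme.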

12.

Sfaira accelerates data and model reuse in single cell genomics.

Fischer, David S; Dony, Leander; König, Martin; Moeed, Abdul; Zappia, Luke; Heumos, Lukas; Tritschler, Sophie; Holmberg, Olle; Aliee, Hananeh; Theis, Fabian J.

Genome Biol ; 22(1): 248, 2021 08 25.

Article in English | MEDLINE | ID: mdl-34433466

ABSTRACT

Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.

Subject(s)

Genomics , Single-Cell Analysis , Animals , Databases, Genetic , Gene Ontology , Humans , Mice , Molecular Sequence Annotation , Reproducibility of Results , Statistics as Topic
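
One way a cross-entropy loss can be adapted to labels of differing coarseness is sketched below. It is an assumption for illustration and is not taken from the abstract or the sfaira code: a coarse label is treated as the set of leaf cell types it covers, and the loss is the negative log of the total predicted probability on that set. The class names and probabilities are hypothetical.

```python
import numpy as np

def coarse_cross_entropy(leaf_probs, allowed_leaves):
    """Negative log of the probability mass on leaves compatible with the label.

    `leaf_probs` is a predicted distribution over leaf cell types;
    `allowed_leaves` lists the leaf indices covered by the (possibly
    coarse) annotation. A fine label is the special case of one leaf.
    """
    leaf_probs = np.asarray(leaf_probs, dtype=float)
    mass = leaf_probs[list(allowed_leaves)].sum()
    return -np.log(mass)

# Hypothetical leaf classes: 0 = CD4 T cell, 1 = CD8 T cell, 2 = B cell.
probs = np.array([0.5, 0.3, 0.2])

fine_loss = coarse_cross_entropy(probs, [0])       # annotated as "CD4 T cell"
coarse_loss = coarse_cross_entropy(probs, [0, 1])  # annotated only as "T cell"
```

With this formulation a cell annotated only as "T cell" is not penalized for probability placed on either T cell subtype, so data sets annotated at different resolutions can share one classifier.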

13.

Single-cell RNA sequencing reveals ex vivo signatures of SARS-CoV-2-reactive T cells through 'reverse phenotyping'.

Fischer, David S; Ansari, Meshal; Wagner, Karolin I; Jarosch, Sebastian; Huang, Yiqi; Mayr, Christoph H; Strunz, Maximilian; Lang, Niklas J; D'Ippolito, Elvira; Hammel, Monika; Mateyka, Laura; Weber, Simone; Wolff, Lisa S; Witter, Klaus; Fernandez, Isis E; Leuschner, Gabriela; Milger, Katrin; Frankenberger, Marion; Nowak, Lorenz; Heinig-Menhard, Katharina; Koch, Ina; Stoleriu, Mircea G; Hilgendorff, Anne; Behr, Jürgen; Pichlmair, Andreas; Schubert, Benjamin; Theis, Fabian J; Busch, Dirk H; Schiller, Herbert B; Schober, Kilian.

Nat Commun ; 12(1): 4515, 2021 07 26.

Article in English | MEDLINE | ID: mdl-34312385

ABSTRACT

The in vivo phenotypic profile of T cells reactive to severe acute respiratory syndrome (SARS)-CoV-2 antigens remains poorly understood. Conventional methods to detect antigen-reactive T cells require in vitro antigenic re-stimulation or highly individualized peptide-human leukocyte antigen (pHLA) multimers. Here, we use single-cell RNA sequencing to identify and profile SARS-CoV-2-reactive T cells from Coronavirus Disease 2019 (COVID-19) patients. To do so, we induce transcriptional shifts by antigenic stimulation in vitro and take advantage of natural T cell receptor (TCR) sequences of clonally expanded T cells as barcodes for 'reverse phenotyping'. This allows identification of SARS-CoV-2-reactive TCRs and reveals phenotypic effects introduced by antigen-specific stimulation. We characterize transcriptional signatures of currently and previously activated SARS-CoV-2-reactive T cells, and show correspondence with phenotypes of T cells from the respiratory tract of patients with severe disease in the presence or absence of virus in independent cohorts. Reverse phenotyping is a powerful tool to provide an integrated insight into cellular states of SARS-CoV-2-reactive T cells across tissues and activation states.

Subject(s)

COVID-19/immunology , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , T-Lymphocytes/metabolism , Aged , Aged, 80 and over , CD4-Positive T-Lymphocytes/metabolism , CD4-Positive T-Lymphocytes/virology , COVID-19/epidemiology , COVID-19/virology , Cells, Cultured , Cohort Studies , Female , Humans , Male , Middle Aged , Pandemics , Receptors, Antigen, T-Cell/genetics , Receptors, Antigen, T-Cell/immunology , Receptors, Antigen, T-Cell/metabolism , SARS-CoV-2/physiology , T-Lymphocytes/virology

14.

Single-cell meta-analysis of SARS-CoV-2 entry genes across tissues and demographics.

Muus, Christoph; Luecken, Malte D; Eraslan, Gökcen; Sikkema, Lisa; Waghray, Avinash; Heimberg, Graham; Kobayashi, Yoshihiko; Vaishnav, Eeshit Dhaval; Subramanian, Ayshwarya; Smillie, Christopher; Jagadeesh, Karthik A; Duong, Elizabeth Thu; Fiskin, Evgenij; Torlai Triglia, Elena; Ansari, Meshal; Cai, Peiwen; Lin, Brian; Buchanan, Justin; Chen, Sijia; Shu, Jian; Haber, Adam L; Chung, Hattie; Montoro, Daniel T; Adams, Taylor; Aliee, Hananeh; Allon, Samuel J; Andrusivova, Zaneta; Angelidis, Ilias; Ashenberg, Orr; Bassler, Kevin; Bécavin, Christophe; Benhar, Inbal; Bergenstråhle, Joseph; Bergenstråhle, Ludvig; Bolt, Liam; Braun, Emelie; Bui, Linh T; Callori, Steven; Chaffin, Mark; Chichelnitskiy, Evgeny; Chiou, Joshua; Conlon, Thomas M; Cuoco, Michael S; Cuomo, Anna S E; Deprez, Marie; Duclos, Grant; Fine, Denise; Fischer, David S; Ghazanfar, Shila; Gillich, Astrid.

Nat Med ; 27(3): 546-559, 2021 03.

Article in English | MEDLINE | ID: mdl-33654293

ABSTRACT

Angiotensin-converting enzyme 2 (ACE2) and accessory proteases (TMPRSS2 and CTSL) are needed for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cellular entry, and their expression may shed light on viral tropism and impact across the body. We assessed the cell-type-specific expression of ACE2, TMPRSS2 and CTSL across 107 single-cell RNA-sequencing studies from different tissues. ACE2, TMPRSS2 and CTSL are coexpressed in specific subsets of respiratory epithelial cells in the nasal passages, airways and alveoli, and in cells from other organs associated with coronavirus disease 2019 (COVID-19) transmission or pathology. We performed a meta-analysis of 31 lung single-cell RNA-sequencing studies with 1,320,896 cells from 377 nasal, airway and lung parenchyma samples from 228 individuals. This revealed cell-type-specific associations of age, sex and smoking with expression levels of ACE2, TMPRSS2 and CTSL. Expression of entry factors increased with age and in males, including in airway secretory cells and alveolar type 2 cells. Expression programs shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues included genes that may mediate viral entry, key immune functions and epithelial-macrophage cross-talk, such as genes involved in the interleukin-6, interleukin-1, tumor necrosis factor and complement pathways. Cell-type-specific expression patterns may contribute to the pathogenesis of COVID-19, and our work highlights putative molecular pathways for therapeutic intervention.

Subject(s)

COVID-19/epidemiology , COVID-19/genetics , Host-Pathogen Interactions/genetics , SARS-CoV-2/physiology , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Virus Internalization , Adult , Aged , Aged, 80 and over , Alveolar Epithelial Cells/metabolism , Alveolar Epithelial Cells/virology , Angiotensin-Converting Enzyme 2/genetics , Angiotensin-Converting Enzyme 2/metabolism , COVID-19/pathology , COVID-19/virology , Cathepsin L/genetics , Cathepsin L/metabolism , Datasets as Topic/statistics & numerical data , Demography , Female , Gene Expression Profiling/statistics & numerical data , Humans , Lung/metabolism , Lung/virology , Male , Middle Aged , Organ Specificity/genetics , Respiratory System/metabolism , Respiratory System/virology , Sequence Analysis, RNA/methods , Serine Endopeptidases/genetics , Serine Endopeptidases/metabolism , Single-Cell Analysis/methods

15.

Predicting antigen specificity of single T cells based on TCR CDR3 regions.

Fischer, David S; Wu, Yihan; Schubert, Benjamin; Theis, Fabian J.

Mol Syst Biol ; 16(8): e9416, 2020 08.

Article in English | MEDLINE | ID: mdl-32779888

ABSTRACT

It has recently become possible to simultaneously assay T-cell specificity with respect to large sets of antigens and the T-cell receptor sequence in high-throughput single-cell experiments. Leveraging this new type of data, we propose and benchmark a collection of deep learning architectures to model T-cell specificity in single cells. In agreement with previous results, we found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly. Moreover, we show that variability in single-cell immune repertoire screens can be mitigated by modeling cell-specific covariates. Lastly, we demonstrate that the number of bound pMHC complexes can be predicted in a continuous fashion providing a gateway to disentangle cell-to-dextramer binding strength and receptor-to-pMHC affinity. We provide these models in the Python package TcellMatch to allow imputation of antigen specificities in single-cell RNA-seq studies on T cells without the need for MHC staining.

Subject(s)

Computational Biology/methods , Histocompatibility Antigens/metabolism , Receptor-CD3 Complex, Antigen, T-Cell/metabolism , Single-Cell Analysis/methods , T-Lymphocytes/immunology , Amino Acid Sequence , Animals , Deep Learning , Histocompatibility Antigens/genetics , Humans , Receptor-CD3 Complex, Antigen, T-Cell/genetics , Sequence Analysis, RNA , Supervised Machine Learning

16.

Automatic identification of relevant genes from low-dimensional embeddings of single-cell RNA-seq data.

Angerer, Philipp; Fischer, David S; Theis, Fabian J; Scialdone, Antonio; Marr, Carsten.

Bioinformatics ; 36(15): 4291-4295, 2020 08 01.

Article in English | MEDLINE | ID: mdl-32207520

ABSTRACT

MOTIVATION: Dimensionality reduction is a key step in the analysis of single-cell RNA-sequencing data. It produces a low-dimensional embedding for visualization and as a calculation base for downstream analysis. Nonlinear techniques are most suitable to handle the intrinsic complexity of large, heterogeneous single-cell data. However, with no linear relation between gene and embedding coordinate, there is no way to extract the identity of genes driving any cell's position in the low-dimensional embedding, making it difficult to characterize the underlying biological processes. RESULTS: In this article, we introduce the concepts of local and global gene relevance to compute an equivalent of principal component analysis loadings for non-linear low-dimensional embeddings. Global gene relevance identifies drivers of the overall embedding, while local gene relevance identifies those of a defined sub-region. We apply our method to single-cell RNA-seq datasets from different experimental protocols and to different low-dimensional embedding techniques. This shows our method's versatility to identify key genes for a variety of biological processes. AVAILABILITY AND IMPLEMENTATION: To ensure reproducibility and ease of use, our method is released as part of destiny 3.0, a popular R package for building diffusion maps from single-cell transcriptomic data. It is readily available through Bioconductor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

RNA-Seq , RNA , Gene Expression Profiling , Principal Component Analysis , RNA/genetics , Reproducibility of Results , Sequence Analysis, RNA , Single-Cell Analysis , Software

17.

MPRAnalyze: statistical framework for massively parallel reporter assays.

Ashuach, Tal; Fischer, David S; Kreimer, Anat; Ahituv, Nadav; Theis, Fabian J; Yosef, Nir.

Genome Biol ; 20(1): 183, 2019 09 02.

Article in English | MEDLINE | ID: mdl-31477158

ABSTRACT

Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.

Subject(s)

Biological Assay , Genes, Reporter , High-Throughput Nucleotide Sequencing/methods , Software , Statistics as Topic , Alleles , Databases, Genetic , Gene Expression Profiling , Hep G2 Cells , Humans , K562 Cells

18.

Concepts and limitations for learning developmental trajectories from single cell genomics.

Tritschler, Sophie; Büttner, Maren; Fischer, David S; Lange, Marius; Bergen, Volker; Lickert, Heiko; Theis, Fabian J.

Development ; 146(12), 2019 06 27.

Article in English | MEDLINE | ID: mdl-31249007

ABSTRACT

Single cell genomics has become a popular approach to uncover the cellular heterogeneity of progenitor and terminally differentiated cell types with great precision. This approach can also delineate lineage hierarchies and identify molecular programmes of cell-fate acquisition and segregation. Nowadays, tens of thousands of cells are routinely sequenced in single cell-based methods and even more are expected to be analysed in the future. However, interpretation of the resulting data is challenging and requires computational models at multiple levels of abstraction. In contrast to other applications of single cell sequencing, where clustering approaches dominate, developmental systems are generally modelled using continuous structures, trajectories and trees. These trajectory models carry the promise of elucidating mechanisms of development, disease and stimulation response at very high molecular resolution. However, their reliable analysis and biological interpretation requires an understanding of their underlying assumptions and limitations. Here, we review the basic concepts of such computational approaches and discuss the characteristics of developmental processes that can be learnt from trajectory models.

Subject(s)

Genomics/methods , Single-Cell Analysis/methods , Algorithms , Animals , Cell Differentiation , Cell Lineage , Cell Proliferation , Chromatin/chemistry , Computational Biology/methods , Developmental Biology/trends , Humans , Methylation , Mice , Models, Biological , Nonlinear Dynamics , Proteomics , RNA/chemistry , RNA Splicing , Sequence Analysis, RNA , Software , Stem Cells/cytology

19.

Inferring population dynamics from single-cell RNA-sequencing time series data.

Fischer, David S; Fiedler, Anna K; Kernfeld, Eric M; Genga, Ryan M J; Bastidas-Ponce, Aimée; Bakhti, Mostafa; Lickert, Heiko; Hasenauer, Jan; Maehr, Rene; Theis, Fabian J.

Nat Biotechnol ; 37(4): 461-468, 2019 04.

Article in English | MEDLINE | ID: mdl-30936567

ABSTRACT

Recent single-cell RNA-sequencing studies have suggested that cells follow continuous transcriptomic trajectories in an asynchronous fashion during development. However, observations of cell flux along trajectories are confounded with population size effects in snapshot experiments and are therefore hard to interpret. In particular, changes in proliferation and death rates can be mistaken for cell flux. Here we present pseudodynamics, a mathematical framework that reconciles population dynamics with the concepts underlying developmental trajectories inferred from time-series single-cell data. Pseudodynamics models population distribution shifts across trajectories to quantify selection pressure, population expansion, and developmental potentials. Applying this model to time-resolved single-cell RNA-sequencing of T-cell and pancreatic beta cell maturation, we characterize proliferation and apoptosis rates and identify key developmental checkpoints, data inaccessible to existing approaches.

Subject(s)

Cell Differentiation/genetics , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Animals , Apoptosis/genetics , Biotechnology , Cell Proliferation/genetics , Female , Insulin-Secreting Cells/cytology , Insulin-Secreting Cells/metabolism , Likelihood Functions , Male , Mice , Mice, Inbred C57BL , Mice, Knockout , Models, Biological , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism , T-Lymphocytes/cytology , T-Lymphocytes/metabolism , Time Factors

20.

Impulse model-based differential expression analysis of time course sequencing data.

Fischer, David S; Theis, Fabian J; Yosef, Nir.

Nucleic Acids Res ; 46(20): e119, 2018 11 16.

Article in English | MEDLINE | ID: mdl-30102402

ABSTRACT

Temporal changes to the concentration of molecular species such as mRNA, which take place in response to various environmental cues, can often be modeled as simple continuous functions such as a single pulse (impulse) model. The simplicity of such functional representations can provide an improved performance on fundamental tasks such as noise reduction, imputation and differential expression analysis. However, temporal gene expression profiles are often studied with models that treat time as a categorical variable, neglecting the dependence between time points. Here, we present ImpulseDE2, a framework for differential expression analysis that combines the power of the impulse model as a continuous representation of temporal responses along with a noise model tailored specifically to sequencing data. We compare the simple categorical models to ImpulseDE2 and to other continuous models based on natural cubic splines and demonstrate the utility of the continuous approach for studying differential expression in time course sequencing experiments. A unique feature of ImpulseDE2 is the ability to distinguish permanently from transiently up- or down-regulated genes. Using an in vitro differentiation dataset, we demonstrate that this gene classification scheme can be used to highlight distinct transcriptional programs that are associated with different phases of the differentiation process.

Subject(s)

Gene Expression , Models, Genetic , Sequence Analysis, RNA/methods , Algorithms , Case-Control Studies , Chromatin/genetics , Chromatin/metabolism , Datasets as Topic , Humans , Likelihood Functions , RNA, Messenger/analysis , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Time Factors , Transcriptome

