1.
Mol Psychiatry ; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844534

ABSTRACT

Understanding the shared and divergent mechanisms across antidepressant (AD) classes and probiotics is critical for improving treatment for mood disorders. Here we examine the transcriptomic effects of bupropion (NDRI), desipramine (SNRI), fluoxetine (SSRI) and a probiotic formulation (Lacidofil®) on 10 regions across the mammalian brain. These treatments massively alter gene expression (on average, 2211 differentially expressed genes (DEGs) per region-treatment combination), highlighting the biological complexity of AD and probiotic action. Intersection of DEG sets against neuropsychiatric GWAS loci, sex-specific transcriptomic portraits of major depressive disorder (MDD), and mouse models of stress and depression reveals significant similarities and differences across treatments. Interestingly, molecular responses in the infralimbic cortex, basolateral amygdala and locus coeruleus are region-specific and highly similar across treatments, whilst responses in the Raphe, medial preoptic area, cingulate cortex, prelimbic cortex and ventral dentate gyrus are predominantly treatment-specific. Mechanistically, ADs concordantly downregulate immune pathways in the amygdala and ventral dentate gyrus. In contrast, protein synthesis, metabolism and synaptic signaling pathways are axes of variability among treatments. We use spatial transcriptomics to further delineate layer-specific molecular pathways and DEGs within the prefrontal cortex. Our study reveals complex AD and probiotics action on the mammalian brain and identifies treatment-specific cellular processes and gene targets associated with mood disorders.

2.
Genome Biol ; 25(1): 94, 2024 04 15.
Article in English | MEDLINE | ID: mdl-38622708

ABSTRACT

Recent innovations in single-cell RNA-sequencing (scRNA-seq) provide the technology to investigate biological questions at cellular resolution. Pooling cells from multiple individuals has become a common strategy, and droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences. An implicit challenge with scRNA-seq is the occurrence of doublets, that is, droplets containing two or more cells. We develop Demuxafy, a framework to enhance donor assignment and doublet removal through the consensus intersection of multiple demultiplexing and doublet detecting methods. Demuxafy significantly improves droplet assignment by separating singlets from doublets and classifying the correct individual.
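The idea of a consensus across several demultiplexing and doublet-detection methods can be illustrated with a small sketch. This is not Demuxafy's code or API: the function names, the label strings ("doublet", "unassigned") and the strict-majority rule are assumptions chosen only to show how per-method calls might be combined for each droplet barcode.

```python
from collections import Counter


def consensus_call(calls, min_agree=None):
    """Return the label that at least `min_agree` methods agree on.

    `calls` holds one label per method for a single droplet. By default a
    strict majority is required (an assumption made for this sketch).
    """
    if min_agree is None:
        min_agree = len(calls) // 2 + 1
    label, count = Counter(calls).most_common(1)[0]
    return label if count >= min_agree else "unassigned"


def consensus_assign(per_method):
    """Combine calls from several methods.

    `per_method` maps a (hypothetical) method name to a dict of
    barcode -> label. Only barcodes called by every method are kept.
    """
    if not per_method:
        raise ValueError("at least one method is required")
    shared = set.intersection(*(set(calls) for calls in per_method.values()))
    return {
        barcode: consensus_call([calls[barcode] for calls in per_method.values()])
        for barcode in sorted(shared)
    }


# Toy input: three hypothetical methods, three droplet barcodes.
per_method = {
    "method_a": {"AAAC": "donor1", "AAAG": "doublet", "AAAT": "donor2"},
    "method_b": {"AAAC": "donor1", "AAAG": "doublet", "AAAT": "donor1"},
    "method_c": {"AAAC": "donor1", "AAAG": "donor2", "AAAT": "doublet"},
}
result = consensus_assign(per_method)
```

In this toy input the first barcode is assigned unanimously, the second is flagged as a doublet by two of three methods, and the third has no majority and is left unassigned.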


Subject(s)
Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods
3.
Genome Med ; 16(1): 45, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539228

ABSTRACT

BACKGROUND: Type 1 diabetes mellitus (T1DM) is a prototypic endocrine autoimmune disease resulting from an immune-mediated destruction of pancreatic insulin-secreting β cells. A comprehensive immune cell phenotype evaluation in T1DM has not been performed thus far at the single-cell level. METHODS: In this cross-sectional analysis, we generated a single-cell transcriptomic dataset of peripheral blood mononuclear cells (PBMCs) from 46 manifest T1DM (stage 3) cases and 31 matched controls. RESULTS: We surprisingly detected profound alterations in circulatory immune cells (1784 dysregulated genes in 13 immune cell types), far exceeding the count in the comparator systemic autoimmune disease SLE. Genes upregulated in T1DM were involved in WNT signaling, interferon signaling and migration of T/NK cells, antigen presentation by B cells, and monocyte activation. A significant fraction of these differentially expressed genes were also altered in T1DM pancreatic islets. We used the single-cell data to construct a T1DM metagene z-score (TMZ score) that distinguished cases and controls and classified patients into molecular subtypes. This score correlated with known prognostic immune markers of T1DM, as well as with drug response in clinical trials. CONCLUSIONS: Our study reveals a surprisingly strong systemic dimension at the level of immune cell network in T1DM, defines disease-relevant molecular subtypes, and has the potential to guide non-invasive test development and patient stratification.
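The abstract does not say how the TMZ score is computed. As a rough illustration of what a metagene z-score can look like in general, the sketch below standardises each signature gene against the control samples and averages the signed z-scores per sample. The function name, the input layout and the choice of controls as the reference distribution are all assumptions, not the authors' method.

```python
from statistics import mean, stdev


def metagene_z(expr, controls, up_genes, down_genes=()):
    """Average signed z-score over a gene signature, per sample.

    expr:       dict of sample -> dict of gene -> expression value
    controls:   sample names used to estimate each gene's mean and SD
    up_genes:   genes expected to be higher in cases (weight +1)
    down_genes: genes expected to be lower in cases (weight -1)
    """
    signed = [(g, 1.0) for g in up_genes] + [(g, -1.0) for g in down_genes]
    stats = {}
    for gene, _ in signed:
        values = [expr[s][gene] for s in controls]
        sd = stdev(values)  # sample SD; needs at least two controls
        if sd == 0:
            raise ValueError(f"gene {gene} has zero variance in controls")
        stats[gene] = (mean(values), sd)
    return {
        sample: mean(
            sign * (profile[gene] - stats[gene][0]) / stats[gene][1]
            for gene, sign in signed
        )
        for sample, profile in expr.items()
    }


# Toy data: three controls and one case, two hypothetical genes.
expr = {
    "c1": {"A": 0, "B": 10},
    "c2": {"A": 2, "B": 12},
    "c3": {"A": 4, "B": 14},
    "p1": {"A": 6, "B": 8},
}
scores = metagene_z(expr, ["c1", "c2", "c3"], up_genes=["A"], down_genes=["B"])
```

With this toy input the controls score close to zero and the case scores about two control standard deviations above them.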


Subject(s)
Autoimmune Diseases , Diabetes Mellitus, Type 1 , Humans , Diabetes Mellitus, Type 1/genetics , Leukocytes, Mononuclear/metabolism , Cross-Sectional Studies , Single-Cell Gene Expression Analysis
4.
Nat Commun ; 15(1): 2342, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38491027

ABSTRACT

High-dimensional, spatially resolved analysis of intact tissue samples promises to transform biomedical research and diagnostics, but existing spatial omics technologies are costly and labor-intensive. We present Fluorescence In Situ Hybridization of Cellular HeterogeneIty and gene expression Programs (FISHnCHIPs) for highly sensitive in situ profiling of cell types and gene expression programs. FISHnCHIPs achieves this by simultaneously imaging ~2-35 co-expressed genes (clustered into modules) that are spatially co-localized in tissues, resulting in similar spatial information as single-gene Fluorescence In Situ Hybridization (FISH), but with ~2-20-fold higher sensitivity. Using FISHnCHIPs, we image up to 53 modules from the mouse kidney and mouse brain, and demonstrate high-speed, large field-of-view profiling of a whole tissue section. FISHnCHIPs also reveals spatially restricted localizations of cancer-associated fibroblasts in a human colorectal cancer biopsy. Overall, FISHnCHIPs enables fast, robust, and scalable cell typing of tissues with normal physiology or undergoing pathogenesis.


Subject(s)
Gene Expression Profiling , Transcriptome , Animals , Mice , Humans , In Situ Hybridization, Fluorescence/methods , Gene Expression Profiling/methods , Transcriptome/genetics
5.
Nat Genet ; 56(3): 431-441, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38413725

ABSTRACT

Spatial omics data are clustered to define both cell types and tissue domains. We present Building Aggregates with a Neighborhood Kernel and Spatial Yardstick (BANKSY), an algorithm that unifies these two spatial clustering problems by embedding cells in a product space of their own and the local neighborhood transcriptome, representing cell state and microenvironment, respectively. BANKSY's spatial feature augmentation strategy improved performance on both tasks when tested on diverse RNA (imaging, sequencing) and protein (imaging) datasets. BANKSY revealed unexpected niche-dependent cell states in the mouse brain and outperformed competing methods on domain segmentation and cell typing benchmarks. BANKSY can also be used for quality control of spatial transcriptomics data and for spatially aware batch effect correction. Importantly, it is substantially faster and more scalable than existing methods, enabling the processing of millions of cell datasets. In summary, BANKSY provides an accurate, biologically motivated, scalable and versatile framework for analyzing spatially resolved omics data.
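The product-space idea can be sketched as feature augmentation: each cell's own expression vector is concatenated with the mean expression of its nearest spatial neighbours, and a mixing parameter controls how much weight the neighbourhood receives. The code below is a simplified illustration, not the BANKSY implementation; the function name, the plain k-nearest-neighbour average and the square-root weighting by `lam` are assumptions made for this sketch.

```python
import math


def augment_with_neighbourhood(coords, expr, k=2, lam=0.2):
    """Concatenate own expression with the mean expression of k spatial neighbours.

    coords: list of (x, y) positions, one per cell
    expr:   list of expression vectors (lists of floats), one per cell
    lam:    weight in [0, 1]; 0 ignores the neighbourhood entirely
    """
    n = len(coords)
    if n < 2:
        raise ValueError("need at least two cells to define a neighbourhood")
    augmented = []
    for i in range(n):
        # Indices of the k closest other cells by Euclidean distance.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )[:k]
        n_genes = len(expr[i])
        nbr_mean = [
            sum(expr[j][g] for j in neighbours) / len(neighbours)
            for g in range(n_genes)
        ]
        own = [math.sqrt(1 - lam) * x for x in expr[i]]
        env = [math.sqrt(lam) * x for x in nbr_mean]
        augmented.append(own + env)
    return augmented


# Toy data: three cells on a line, one gene each.
coords = [(0, 0), (1, 0), (10, 0)]
expr = [[1.0], [3.0], [5.0]]
features = augment_with_neighbourhood(coords, expr, k=1, lam=0.5)
```

The augmented vectors can then be passed to any standard clustering routine. In a scheme like this, a small `lam` keeps the result close to ordinary cell typing, while a larger `lam` groups cells more by their surroundings.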


Subject(s)
Algorithms , Benchmarking , Animals , Mice , Gene Expression Profiling , RNA , Transcriptome , Data Analysis
6.
Nat Commun ; 15(1): 567, 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38238298

ABSTRACT

Due to the paucity of longitudinal molecular studies of COVID-19, particularly those covering the early stages of infection (Days 1-8 after symptom onset), our understanding of host response over the disease course is limited. We perform longitudinal single cell RNA-seq on 286 blood samples from 108 age- and sex-matched COVID-19 patients, including 73 with early samples. We examine discrete cell subtypes and continuous cell states longitudinally, and we identify upregulation of type I IFN-stimulated genes (ISGs) as the predominant early signature of subsequent worsening of symptoms, which we validate in an independent cohort and corroborate by plasma markers. However, ISG expression is dynamic in progressors, spiking early and then rapidly receding to the level of severity-matched non-progressors. In contrast, cross-sectional analysis shows that ISG expression is deficient and IFN suppressors such as SOCS3 are upregulated in severe and critical COVID-19. We validate the latter in four independent cohorts, and SOCS3 inhibition reduces SARS-CoV-2 replication in vitro. In summary, we identify complexity in type I IFN response to COVID-19, as well as a potential avenue for host-directed therapy.


Subject(s)
COVID-19 , Interferon Type I , Humans , Cross-Sectional Studies , SARS-CoV-2 , Up-Regulation
7.
Nat Biotechnol ; 2023 Aug 17.
Article in English | MEDLINE | ID: mdl-37592035

ABSTRACT

Single-cell omics technologies enable molecular characterization of diverse cell types and states, but how the resulting transcriptional and epigenetic profiles depend on the cell's genetic background remains understudied. We describe Monopogen, a computational tool to detect single-nucleotide variants (SNVs) from single-cell sequencing data. Monopogen leverages linkage disequilibrium from external reference panels to identify germline SNVs and detects putative somatic SNVs using allele cosegregating patterns at the cell population level. It can identify 100 K to 3 M germline SNVs achieving a genotyping accuracy of 95%, together with hundreds of putative somatic SNVs. Monopogen-derived genotypes enable global and local ancestry inference and identification of admixed samples. It identifies variants associated with cardiomyocyte metabolic levels and epigenomic programs. It also improves putative somatic SNV detection that enables clonal lineage tracing in primary human clonal hematopoiesis. Monopogen brings together population genetics, cell lineage tracing and single-cell omics to uncover genetic determinants of cellular processes.

8.
Gut ; 72(9): 1651-1663, 2023 09.
Article in English | MEDLINE | ID: mdl-36918265

ABSTRACT

OBJECTIVE: Gastric cancer (GC) is a leading cause of cancer mortality, with ARID1A being the second most frequently mutated driver gene in GC. We sought to decipher ARID1A-specific GC regulatory networks and examine therapeutic vulnerabilities arising from ARID1A loss. DESIGN: Genomic profiling of GC patients including a Singapore cohort (>200 patients) was performed to derive mutational signatures of ARID1A inactivation across molecular subtypes. Single-cell transcriptomic profiles of ARID1A-mutated GCs were analysed to examine tumour microenvironmental changes arising from ARID1A loss. Genome-wide ARID1A binding and chromatin profiles (H3K27ac, H3K4me3, H3K4me1, ATAC-seq) were generated to identify gastric-specific epigenetic landscapes regulated by ARID1A. Distinct cancer hallmarks of ARID1A-mutated GCs were converged at the genomic, single-cell and epigenomic level, and targeted by pharmacological inhibition. RESULTS: We observed prevalent ARID1A inactivation across GC molecular subtypes, with distinct mutational signatures and linked to a NFKB-driven proinflammatory tumour microenvironment. ARID1A-depletion caused loss of H3K27ac activation signals at ARID1A-occupied distal enhancers, but unexpectedly gain of H3K27ac at ARID1A-occupied promoters in genes such as NFKB1 and NFKB2. Promoter activation in ARID1A-mutated GCs was associated with enhanced gene expression, increased BRD4 binding, and reduced HDAC1 and CTCF occupancy. Combined targeting of promoter activation and tumour inflammation via bromodomain and NFKB inhibitors confirmed therapeutic synergy specific to ARID1A-genomic status. CONCLUSION: Our results suggest a therapeutic strategy for ARID1A-mutated GCs targeting both tumour-intrinsic (BRD4-associated promoter activation) and extrinsic (NFKB immunomodulation) cancer phenotypes.


Subject(s)
Stomach Neoplasms , Transcription Factors , Humans , Transcription Factors/genetics , Transcription Factors/metabolism , Stomach Neoplasms/genetics , Stomach Neoplasms/therapy , Stomach Neoplasms/pathology , Nuclear Proteins/genetics , Epigenomics , Mutation , Tumor Microenvironment/genetics , DNA-Binding Proteins/genetics , Cell Cycle Proteins/genetics
9.
Blood ; 141(22): 2738-2755, 2023 06 01.
Article in English | MEDLINE | ID: mdl-36857629

ABSTRACT

Primary resistance to tyrosine kinase inhibitors (TKIs) is a significant barrier to optimal outcomes in chronic myeloid leukemia (CML), but factors contributing to response heterogeneity remain unclear. Using single-cell RNA (scRNA) sequencing, we identified 8 statistically significant features in pretreatment bone marrow, which correlated with either sensitivity (major molecular response or MMR) or extreme resistance to imatinib (eventual blast crisis [BC] transformation). Employing machine-learning, we identified leukemic stem cell (LSC) and natural killer (NK) cell gene expression profiles predicting imatinib response with >80% accuracy, including no false positives for predicting BC. A canonical erythroid-specifying (TAL1/KLF1/GATA1) regulon was a hallmark of LSCs from patients with MMR and was associated with erythroid progenitor [ERP] expansion in vivo (P < .05), and a 2- to 10-fold (6.3-fold in group A vs 1.09-fold in group C) erythroid over myeloid bias in vitro. Notably, ERPs demonstrated exquisite TKI sensitivity compared with myeloid progenitors (P < .001). These LSC features were lost with progressive resistance, and MYC- and IRF1-driven inflammatory regulons were evident in patients who progressed to transformation. Patients with MMR also exhibited a 56-fold expansion (P < .01) of a normally rare subset of hyperfunctional adaptive-like NK cells, which diminished with progressive resistance, whereas patients destined for BC accumulated inhibitory NKG2A+ NK cells favoring NK cell tolerance. Finally, we developed antibody panels to validate our scRNA-seq findings. These panels may be useful for prospective studies of primary resistance, and in assessing the contribution of predetermined vs acquired factors in TKI response heterogeneity.


Subject(s)
Leukemia, Myelogenous, Chronic, BCR-ABL Positive , Protein Kinase Inhibitors , Humans , Imatinib Mesylate/pharmacology , Imatinib Mesylate/therapeutic use , Prospective Studies , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/metabolism , Blast Crisis , Drug Resistance, Neoplasm/genetics
10.
Nat Genet ; 55(2): 178-186, 2023 02.
Article in English | MEDLINE | ID: mdl-36658435

ABSTRACT

Precision medicine promises to transform healthcare for groups and individuals through early disease detection, refining diagnoses and tailoring treatments. Analysis of large-scale genomic-phenotypic databases is a critical enabler of precision medicine. Although Asia is home to 60% of the world's population, many Asian ancestries are under-represented in existing databases, leading to missed opportunities for new discoveries, particularly for diseases most relevant for these populations. The Singapore National Precision Medicine initiative is a whole-of-government 10-year initiative aiming to generate precision medicine data of up to one million individuals, integrating genomic, lifestyle, health, social and environmental data. Beyond technologies, routine adoption of precision medicine in clinical practice requires social, ethical, legal and regulatory barriers to be addressed. Identifying driver use cases in which precision medicine results in standardized changes to clinical workflows or improvements in population health, coupled with health economic analysis to demonstrate value-based healthcare, is a vital prerequisite for responsible health system adoption.


Subject(s)
Delivery of Health Care , Precision Medicine , Humans , Singapore , Precision Medicine/methods , Asia
12.
Nat Commun ; 13(1): 6694, 2022 11 05.
Article in English | MEDLINE | ID: mdl-36335097

ABSTRACT

Asian populations are under-represented in human genomics research. Here, we characterize clinically significant genetic variation in 9051 genomes representing East Asian, South Asian, and severely under-represented Austronesian-speaking Southeast Asian ancestries. We observe disparate genetic risk burden attributable to ancestry-specific recurrent variants and identify individuals with variants specific to ancestries discordant to their self-reported ethnicity, mostly due to cryptic admixture. About 27% of severe recessive disorder genes with appreciable carrier frequencies in Asians are missed by carrier screening panels, and we estimate that 0.5% of Asian couples are at risk of having an affected child. Prevalence of medically-actionable variant carriers is 3.4% and a further 1.6% harbour variants with potential for pathogenic classification upon additional clinical/experimental evidence. We profile 23 pharmacogenes with high-confidence gene-drug associations and find that 22.4% of Asians at risk of Centers for Disease Control and Prevention Tier 1 genetic conditions concurrently harbour pharmacogenetic variants with actionable phenotypes, highlighting the benefits of pre-emptive pharmacogenomics. Our findings illuminate the diversity in genetic disease epidemiology and opportunities for precision medicine for a large, diverse Asian population.


Subject(s)
Asian People , Genome, Human , Child , Humans , Asian People/genetics , Genome, Human/genetics , Ethnicity , Pharmacogenetics , Phenotype
14.
Mol Psychiatry ; 27(11): 4510-4525, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36056172

ABSTRACT

Depression and anxiety are major global health burdens. Although SSRIs targeting the serotonergic system are prescribed over 200 million times annually, they have variable therapeutic efficacy and side effects, and mechanisms of action remain incompletely understood. Here, we comprehensively characterise the molecular landscape of gene regulatory changes associated with fluoxetine, a widely-used SSRI. We performed multimodal analysis of SSRI response in 27 mammalian brain regions using 310 bulk RNA-seq and H3K27ac ChIP-seq datasets, followed by in-depth characterisation of two hippocampal regions using single-cell RNA-seq (20 datasets). Remarkably, fluoxetine induced profound region-specific shifts in gene expression and chromatin state, including in the nucleus accumbens shell, locus coeruleus and septal areas, as well as in more well-studied regions such as the raphe and hippocampal dentate gyrus. Expression changes were strongly enriched at GWAS loci for depression and antidepressant drug response, stressing the relevance to human phenotypes. We observed differential expression at dozens of signalling receptors and pathways, many of which are previously unknown. Single-cell analysis revealed stark differences in fluoxetine response between the dorsal and ventral hippocampal dentate gyri, particularly in oligodendrocytes, mossy cells and inhibitory neurons. Across diverse brain regions, integrative omics analysis consistently suggested increased energy metabolism via oxidative phosphorylation and mitochondrial changes, which we corroborated in vitro; this may thus constitute a shared mechanism of action of fluoxetine. Similarly, we observed pervasive chromatin remodelling signatures across the brain. 
Our study reveals unexpected regional and cell type-specific heterogeneity in SSRI action, highlights under-studied brain regions that may play a major role in antidepressant response, and provides a rich resource of candidate cell types, genes, gene regulatory elements and pathways for mechanistic analysis and identifying new therapeutic targets for depression and anxiety.


Subject(s)
Chromatin Assembly and Disassembly , Fluoxetine , Humans , Antidepressive Agents/pharmacology , Brain/metabolism , Energy Metabolism/genetics , Fluoxetine/pharmacology , Fluoxetine/metabolism , Mammals , Multiomics , Animals
15.
Nat Microbiol ; 7(2): 312-326, 2022 02.
Article in English | MEDLINE | ID: mdl-35102304

ABSTRACT

Host cell chromatin changes are thought to play an important role in the pathogenesis of infectious diseases. Here we describe a histone acetylome-wide association study (HAWAS) of an infectious disease, on the basis of genome-wide H3K27 acetylation profiling of peripheral blood granulocytes and monocytes from persons with active Mycobacterium tuberculosis (Mtb) infection and healthy controls. We detected >2,000 differentially acetylated loci in either cell type in a Singapore Chinese discovery cohort (n = 46), which were validated in a subsequent multi-ethnic Singapore cohort (n = 29), as well as a longitudinal cohort from South Africa (n = 26), thus demonstrating that HAWAS can be independently corroborated. Acetylation changes were correlated with differential gene expression. Differential acetylation was enriched near potassium channel genes, including KCNJ15, which modulates apoptosis and promotes Mtb clearance in vitro. We performed histone acetylation quantitative trait locus (haQTL) analysis on the dataset and identified 69 candidate causal variants for immune phenotypes among granulocyte haQTLs and 83 among monocyte haQTLs. Our study provides proof-of-principle for HAWAS to infer mechanisms of host response to pathogens.


Subject(s)
Genetic Association Studies , Histones/genetics , Mycobacterium tuberculosis/immunology , Tuberculosis/genetics , Tuberculosis/immunology , Acetylation , Adult , Chromatin , Cohort Studies , Female , Granulocytes/immunology , Histones/immunology , Humans , Longitudinal Studies , Male , Monocytes/immunology , Monocytes/microbiology , Proof of Concept Study , Quantitative Trait Loci , Singapore , South Africa , THP-1 Cells , Tuberculosis/microbiology , Young Adult
16.
mBio ; 13(1): e0343621, 2022 02 22.
Article in English | MEDLINE | ID: mdl-35038898

ABSTRACT

The dynamics of SARS-CoV-2 infection in COVID-19 patients are highly variable, with a subset of patients demonstrating prolonged virus shedding, which poses a significant challenge for disease management and transmission control. In this study, the long-term dynamics of SARS-CoV-2 infection were investigated using a human well-differentiated nasal epithelial cell (NEC) model of infection. NECs were observed to release SARS-CoV-2 virus onto the apical surface for up to 28 days postinfection (dpi), further corroborated by viral antigen staining. Single-cell transcriptome sequencing (sc-seq) was utilized to explore the host response from infected NECs after short-term (3-dpi) and long-term (28-dpi) infection. We identified a unique population of cells harboring high viral loads present at both 3 and 28 dpi, characterized by expression of cell stress-related genes DDIT3 and ATF3 and enriched for genes involved in tumor necrosis factor alpha (TNF-α) signaling and apoptosis. Remarkably, this sc-seq analysis revealed an antiviral gene signature within all NEC cell types even at 28 dpi. We demonstrate increased replication of basal cells, absence of widespread cell death within the epithelial monolayer, and the ability of SARS-CoV-2 to replicate despite a continuous interferon response as factors likely contributing to SARS-CoV-2 persistence. This study provides a model system for development of therapeutics aimed at improving viral clearance in immunocompromised patients and implies a crucial role for immune cells in mediating viral clearance from infected epithelia. IMPORTANCE Increasing medical attention has been drawn to the persistence of symptoms (long-COVID syndrome) or live virus shedding from subsets of COVID-19 patients weeks to months after the initial onset of symptoms. In vitro approaches to model viral or symptom persistence are needed to fully dissect the complex and likely varied mechanisms underlying these clinical observations. 
We show that in vitro differentiated human NECs are persistently infected with SARS-CoV-2 for up to 28 dpi. This viral replication occurred despite the presence of an antiviral gene signature across all NEC cell types even at 28 dpi. This indicates that epithelial cell intrinsic antiviral responses are insufficient for the clearance of SARS-CoV-2, implying an essential role for tissue-resident and infiltrating immune cells for eventual viral clearance from infected airway tissue in COVID-19 patients.


Subject(s)
COVID-19 , Humans , SARS-CoV-2 , Post-Acute COVID-19 Syndrome , Epithelial Cells , Antiviral Agents
17.
BMC Genomics ; 22(1): 789, 2021 Nov 03.
Article in English | MEDLINE | ID: mdl-34732136

ABSTRACT

BACKGROUND: Transposable elements (TE) comprise nearly half of the human genome and their insertions have profound effects on human genetic diversification as well as disease. Despite their abovementioned significance, there is no consensus on the TE subfamilies that remain active in the human genome. In this study, we therefore developed a novel statistical test for recently mobile subfamilies (RMSs), based on patterns of overlap with > 100,000 polymorphic indels. RESULTS: Our analysis produced a catalogue of 20 high-confidence RMSs, which excludes many false positives in public databases. Intriguingly though, it includes HERV-K, an LTR subfamily previously thought to be extinct. The RMS catalogue is strongly enriched for contributions to germline genetic disorders (P = 1.1e-10), and thus constitutes a valuable resource for diagnosing disorders of unknown aetiology using targeted TE-insertion screens. Remarkably, RMSs are also highly enriched for somatic insertions in diverse cancers (P = 2.8e-17), thus indicating strong correlations between germline and somatic TE mobility. Using CRISPR/Cas9 deletion, we show that an RMS-derived polymorphic TE insertion increased the expression of RPL17, a gene associated with lower survival in liver cancer. More broadly, polymorphic TE insertions from RMSs were enriched near genes with allele-specific expression, suggesting widespread effects on gene regulation. CONCLUSIONS: By using a novel statistical test we have defined a catalogue of 20 recently mobile transposable element subfamilies. We illustrate the gene regulatory potential of RMS-derived polymorphic TE insertions, using CRISPR/Cas9 deletion in vitro on a specific candidate, as well as by genome wide analysis of allele-specific expression. Our study presents novel insights into TE mobility and regulatory potential and provides a key resource for human disease genetics and population history studies.


Subject(s)
DNA Transposable Elements , Endogenous Retroviruses , DNA Transposable Elements/genetics , Gene Expression Regulation , Genome, Human , Humans
18.
Cell Rep ; 37(7): 110022, 2021 11 16.
Article in English | MEDLINE | ID: mdl-34788620

ABSTRACT

Alternative splicing is a post-transcriptional regulatory mechanism producing distinct mRNA molecules from a single pre-mRNA with a prominent role in the development and function of the central nervous system. We used long-read isoform sequencing to generate full-length transcript sequences in the human and mouse cortex. We identify novel transcripts not present in existing genome annotations, including transcripts mapping to putative novel (unannotated) genes and fusion transcripts incorporating exons from multiple genes. Global patterns of transcript diversity are similar between human and mouse cortex, although certain genes are characterized by striking differences between species. We also identify developmental changes in alternative splicing, with differential transcript usage between human fetal and adult cortex. Our data confirm the importance of alternative splicing in the cortex, dramatically increasing transcriptional diversity and representing an important mechanism underpinning gene regulation in the brain. We provide transcript-level data for human and mouse cortex as a resource to the scientific community.


Subject(s)
Cerebral Cortex/metabolism , Protein Isoforms/genetics , Transcriptome/genetics , Alternative Splicing/genetics , Animals , Brain/metabolism , Cerebral Cortex/physiology , Exons/genetics , Gene Expression/genetics , Gene Expression Profiling/methods , Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Mice , Protein Isoforms/metabolism , RNA Precursors/genetics , RNA Splice Sites/genetics , RNA, Messenger/genetics , Sequence Analysis, RNA/methods
19.
Nat Commun ; 12(1): 5849, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615861

ABSTRACT

Feature selection (marker gene selection) is widely believed to improve clustering accuracy, and is thus a key component of single cell clustering pipelines. Existing feature selection methods perform inconsistently across datasets, occasionally even resulting in poorer clustering accuracy than without feature selection. Moreover, existing methods ignore information contained in gene-gene correlations. Here, we introduce DUBStepR (Determining the Underlying Basis using Stepwise Regression), a feature selection algorithm that leverages gene-gene correlations with a novel measure of inhomogeneity in feature space, termed the Density Index (DI). Despite selecting a relatively small number of genes, DUBStepR substantially outperformed existing single-cell feature selection methods across diverse clustering benchmarks. Additionally, DUBStepR was the only method to robustly deconvolve T and NK heterogeneity by identifying disease-associated common and rare cell types and subtypes in PBMCs from rheumatoid arthritis patients. DUBStepR is scalable to over a million cells, and can be straightforwardly applied to other data types such as single-cell ATAC-seq. We propose DUBStepR as a general-purpose feature selection solution for accurately clustering single-cell data.


Subject(s)
Machine Learning , Single-Cell Analysis/methods , Algorithms , Arthritis, Rheumatoid , Chromatin Immunoprecipitation Sequencing , Cluster Analysis , Gene Expression , Genes, Mitochondrial , Humans , RNA-Seq , Research Design , Sequence Analysis, RNA , Software
20.
Nucleic Acids Res ; 49(15): 8505-8519, 2021 09 07.
Article in English | MEDLINE | ID: mdl-34320202

ABSTRACT

The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to group cells by technical, rather than biological, variation. Compared to de-novo (unsupervised) clustering, we demonstrate using multiple benchmarks that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.
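Reference projection, as described in general terms above, can be illustrated by correlating each cell's expression vector with a panel of reference transcriptomes and using the resulting correlation vector as the cell's coordinates. The sketch below is not RCA2's code or interface; the function names, the use of Pearson correlation and the toy panel are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def project(cell, reference):
    """Map a cell to its correlation with each reference profile."""
    return {name: pearson(cell, profile) for name, profile in reference.items()}


def best_match(cell, reference):
    """Name of the reference profile the cell correlates with most strongly."""
    projection = project(cell, reference)
    return max(projection, key=projection.get)


# Hypothetical reference panel over four genes, plus one query cell.
reference = {
    "T_cell": [5.0, 1.0, 0.0, 2.0],
    "B_cell": [0.0, 4.0, 6.0, 1.0],
}
cell = [4.0, 1.0, 1.0, 2.0]
label = best_match(cell, reference)
```

Because every cell is described only by its similarity to fixed reference profiles, cells from different batches land in the same coordinate system. Clustering those projection vectors, for example on a nearest-neighbour graph, is then a separate step.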


Subject(s)
Algorithms , Cluster Analysis , RNA, Small Cytoplasmic/genetics , RNA-Seq/methods , Single-Cell Analysis/methods , Animals , Arthritis, Rheumatoid/genetics , Bone Marrow Cells/metabolism , COVID-19/blood , COVID-19/pathology , Cohort Studies , Datasets as Topic , Humans , Leukocytes, Mononuclear/metabolism , Leukocytes, Mononuclear/pathology , Mice , Organ Specificity , Quality Control , RNA-Seq/standards , Single-Cell Analysis/standards , Transcriptome