Results 1 - 6 of 6
1.
Nat Comput Sci ; 4(5): 346-359, 2024 May.
Article in English | MEDLINE | ID: mdl-38730185

ABSTRACT

Single-cell epigenomic data have been growing continuously at an unprecedented pace, but their characteristics, such as high dimensionality and sparsity, pose substantial challenges to downstream analysis. Although deep learning models, especially variational autoencoders, have been widely used to capture low-dimensional feature embeddings, the prevalent Gaussian assumption somewhat disagrees with real data, and these models tend to struggle to incorporate reference information from abundant cell atlases. Here we propose CASTLE, a deep generative model based on the vector-quantized variational autoencoder framework to extract discrete latent embeddings that interpretably characterize single-cell chromatin accessibility sequencing data. We validate the performance and robustness of CASTLE for accurate cell-type identification and reasonable visualization compared with state-of-the-art methods. We demonstrate the advantages of CASTLE for effective incorporation of existing massive reference datasets in a weakly supervised or supervised manner. We further demonstrate CASTLE's capacity for intuitively distilling cell-type-specific feature spectra that unveil cell heterogeneity and biological implications quantitatively.


Subject(s)
Chromatin , Single-Cell Analysis , Single-Cell Analysis/methods , Chromatin/genetics , Chromatin/metabolism , Humans , Epigenomics/methods , Deep Learning , Algorithms , Genetic Heterogeneity
3.
Nat Commun ; 15(1): 1629, 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38388573

ABSTRACT

Single-cell chromatin accessibility sequencing (scCAS) has emerged as a valuable tool for interrogating and elucidating epigenomic heterogeneity and gene regulation. However, scCAS data inherently suffers from limitations such as high sparsity and dimensionality, which pose significant challenges for downstream analyses. Although several methods have been proposed to enhance scCAS data, challenges and limitations still hinder their effectiveness. Here, we propose scCASE, an scCAS data enhancement method based on non-negative matrix factorization which incorporates an iteratively updating cell-to-cell similarity matrix. Through comprehensive experiments on multiple datasets, we demonstrate the advantages of scCASE over existing methods for scCAS data enhancement. The interpretable cell type-specific peaks identified by scCASE can provide valuable biological insights into cell subpopulations. Moreover, to leverage the large compendia of available omics data as a reference, we further expand scCASE to scCASER, which enables the incorporation of external reference data to improve enhancement performance.


Subject(s)
Algorithms , Chromatin , Chromatin/genetics , Epigenomics/methods , Gene Expression Regulation , Single-Cell Analysis
5.
Nucleic Acids Res ; 49(W1): W483-W490, 2021 Jul 02.
Article in English | MEDLINE | ID: mdl-33999180

ABSTRACT

Chromatin accessibility, as a powerful marker of active DNA regulatory elements, provides valuable information for understanding regulatory mechanisms. The revolution in high-throughput methods has accumulated massive chromatin accessibility profiles in public repositories. Nevertheless, utilization of these data is hampered by cumbersome collection, time-consuming processing, and manual chromatin accessibility (openness) annotation of genomic regions. To fill this gap, we developed OpenAnnotate (http://health.tsinghua.edu.cn/openannotate/) as the first web server for efficiently annotating openness of massive genomic regions across various biosample types, tissues, and biological systems. In addition to the annotation resource from 2729 comprehensive profiles of 614 biosample types of human and mouse, OpenAnnotate provides user-friendly functionalities, ultra-efficient calculation, real-time browsing, intuitive visualization, and elaborate application notebooks. We show its unique advantages compared to existing databases and toolkits by effectively revealing cell type-specificity, identifying regulatory elements and 3D chromatin contacts, deciphering gene functional relationships, inferring functions of transcription factors, and unprecedentedly promoting single-cell data analyses. We anticipate OpenAnnotate will provide a promising avenue for researchers to construct a more holistic perspective to understand regulatory mechanisms.


Subject(s)
Chromatin/metabolism , Genomics/methods , Molecular Sequence Annotation/methods , Software , Internet , Regulatory Sequences, Nucleic Acid , Single-Cell Analysis , Transcription Factors/metabolism
6.
Nucleic Acids Res ; 49(D1): D221-D228, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33045745

ABSTRACT

Gene regulatory elements, including promoters, enhancers, silencers, etc., control transcriptional programs in a spatiotemporal manner. Though these elements are known to be able to induce either positive or negative transcriptional control, the community has been mostly studying enhancers which amplify transcription initiation, with less emphasis given to silencers which repress gene expression. To facilitate the study of silencers and the investigation of their potential roles in transcriptional control, we developed SilencerDB (http://health.tsinghua.edu.cn/silencerdb/), a comprehensive database of silencers by manually curating silencers from 2300 published articles. The current version, SilencerDB 1.0, contains (i) 33 060 validated silencers from experimental methods, and (ii) 5 045 547 predicted silencers from state-of-the-art machine learning methods. The functionality of SilencerDB includes (a) standardized categorization of silencers in a tree-structured class hierarchy based on species, organ, tissue and cell line and (b) comprehensive annotations of silencers with the nearest gene and potential regulatory genes. SilencerDB, to the best of our knowledge, is the first comprehensive database at this scale dedicated to silencers, with reliable annotations and user-friendly interactive database features. We believe this database has the potential to enable advanced understanding of silencers in regulatory mechanisms and to empower researchers to devise diverse applications of silencers in disease development.


Subject(s)
Databases, Nucleic Acid , Machine Learning , Silencer Elements, Transcriptional , Transcription, Genetic , User-Computer Interface , Animals , Buffaloes/genetics , Cell Line , Chickens/genetics , Drosophila melanogaster/genetics , Humans , Internet , Mice , Molecular Sequence Annotation , Organ Specificity , Rats , Sus scrofa/genetics