Search | VHL Regional Portal

A Bayesian method to infer copy number clones from single-cell RNA and ATAC sequencing.

Patruno, Lucrezia; Milite, Salvatore; Bergamin, Riccardo; Calonaci, Nicola; D'Onofrio, Alberto; Anselmi, Fabio; Antoniotti, Marco; Graudenzi, Alex; Caravagna, Giulio.

PLoS Comput Biol ; 19(11): e1011557, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37917660

ABSTRACT

Single-cell RNA and ATAC sequencing technologies enable the examination of gene expression and chromatin accessibility in individual cells, providing insights into cellular phenotypes. In cancer research, it is important to analyse these states consistently in the evolutionary context of genetic clones. Here we present CONGAS+, a Bayesian model to map single-cell RNA and ATAC profiles onto the latent space of copy number clones. CONGAS+ clusters cells into tumour subclones with similar ploidy, making it straightforward to compare their expression and chromatin profiles. The framework, implemented on GPU and tested on real and simulated data, scales seamlessly to the analysis of thousands of cells, demonstrating better performance than single-molecule models, and supporting new multi-omics assays. In prostate cancer, lymphoma and basal cell carcinoma, CONGAS+ successfully identifies complex subclonal architectures while providing a coherent mapping between ATAC and RNA, facilitating the study of genotype-phenotype maps and their connection to genomic instability.

Subject(s)

DNA Copy Number Variations, RNA, RNA/genetics, Bayes Theorem, DNA Copy Number Variations/genetics, Clone Cells, High-Throughput Nucleotide Sequencing/methods, Chromatin

A Bayesian method to cluster single-cell RNA sequencing data using copy number alterations.

Milite, Salvatore; Bergamin, Riccardo; Patruno, Lucrezia; Calonaci, Nicola; Caravagna, Giulio.

Bioinformatics ; 38(9): 2512-2518, 2022 04 28.

Article in English | MEDLINE | ID: mdl-35298589

ABSTRACT

MOTIVATION: Cancers are composed of several heterogeneous subpopulations, each one harbouring different genetic and epigenetic somatic alterations that contribute to disease onset and therapy response. In recent years, copy number alterations (CNAs) leading to tumour aneuploidy have been identified as potential key drivers of such populations, but the definition of the precise makeup of cancer subclones from sequencing assays remains challenging. As a result, little is known about the mapping between complex CNAs and their effect on cancer phenotypes. RESULTS: We introduce CONGAS, a Bayesian probabilistic method to phase bulk DNA and single-cell RNA measurements from independent assays. CONGAS jointly identifies clusters of single cells with subclonal CNAs, and differences in RNA expression. The model builds statistical priors leveraging bulk DNA sequencing data, does not require a normal reference and scales efficiently thanks to a GPU backend and variational inference. We test CONGAS on both simulated and real data, and find that it can determine the tumour subclonal composition at the single-cell level together with clone-specific RNA phenotypes in tumour data generated from both 10× and Smart-Seq assays. AVAILABILITY AND IMPLEMENTATION: CONGAS is available as two packages: CONGAS (https://github.com/caravagnalab/congas), which implements the model in Python, and RCONGAS (https://caravagnalab.github.io/rcongas/), which provides R functions to process inputs and outputs and to run CONGAS fits. The analysis of real data and the scripts to generate the figures of this paper are available via RCONGAS; code associated with the simulations is available at https://github.com/caravagnalab/rcongas_test. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

DNA Copy Number Variations, Neoplasms, Humans, Bayes Theorem, Software, Sequence Analysis, RNA, RNA, Neoplasms/genetics, Single-Cell Analysis

A review of computational strategies for denoising and imputation of single-cell transcriptomic data.

Patruno, Lucrezia; Maspero, Davide; Craighero, Francesco; Angaroni, Fabrizio; Antoniotti, Marco; Graudenzi, Alex.

Brief Bioinform ; 22(4), 2021 07 20.

Article in English | MEDLINE | ID: mdl-33003202

ABSTRACT

MOTIVATION: Advances in single-cell sequencing methods have paved the way for the characterization of cellular states at unprecedented resolution, revolutionizing the investigation of complex biological systems. Yet, single-cell sequencing experiments are hindered by several technical issues, which cause output data to be noisy, impacting the reliability of downstream analyses. Therefore, a growing number of data science methods have been proposed to recover lost or corrupted information from single-cell sequencing data. To date, however, no quantitative benchmarks have been proposed to evaluate such methods. RESULTS: We present a comprehensive analysis of the state-of-the-art computational approaches for denoising and imputation of single-cell transcriptomic data, comparing their performance in different experimental scenarios. In detail, we compared 19 denoising and imputation methods, on both simulated and real-world datasets, with respect to several performance metrics related to imputation of dropout events, recovery of true expression profiles, characterization of cell similarity, identification of differentially expressed genes and computation time. The effectiveness and scalability of all methods were assessed with regard to distinct sequencing protocols, sample size and different levels of biological variability and technical noise. As a result, we identify a subset of versatile approaches exhibiting solid performance on most tests and show that certain algorithmic families prove effective on specific tasks but inefficient on others. Finally, most methods appear to benefit from the introduction of appropriate assumptions on noise distribution of biological processes.

Subject(s)

Gene Expression Profiling, RNA-Seq, Single-Cell Analysis, Software, Animals, Humans
