Batch-Corrected Distance Mitigates Temporal and Spatial Variability for Clustering and Visualization of Single-Cell Gene Expression Data
Shaoheng Liang; Jinzhuang Dou; Ramiz Iqbal; Ken Chen.
Affiliation
  • Shaoheng Liang; The University of Texas MD Anderson Cancer Center
  • Jinzhuang Dou; The University of Texas MD Anderson Cancer Center
  • Ramiz Iqbal; The University of Texas MD Anderson Cancer Center
  • Ken Chen; The University of Texas MD Anderson Cancer Center
Preprint in English | bioRxiv | ID: ppbiorxiv-332080
ABSTRACT
Clustering and visualization are essential parts of single-cell gene expression data analysis. The Euclidean distance used in most distance-based methods is not optimal. Batch effect, i.e., the variability among samples gathered at different times and from different tissues and patients, introduces large between-group distances and obscures the true identities of cells. To solve this problem, we introduce Batch-Corrected Distance (BCD), a metric that uses the temporal/spatial locality of the batch effect to control for such factors. We validate BCD on simulated data and apply it to a mouse retina development dataset and a lung dataset. We also demonstrate the utility of our approach in understanding the progression of Coronavirus Disease 2019 (COVID-19). BCD achieves more accurate clusters and better visualizations than state-of-the-art batch correction methods on longitudinal datasets. BCD can be directly integrated with most clustering and visualization methods to enable more scientific findings.
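The problem the abstract describes, a batch effect inflating Euclidean distances so that cells group by batch rather than by identity, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the four two-gene profiles, the additive shift of +5 on the second batch, and the helper names `center_by_batch` and `nearest` are hypothetical. The correction used here is plain per-batch mean centering, a naive stand-in that only works because both toy batches contain the same cell types; it is not the BCD metric, whose definition the abstract does not give.

```python
import math

# Toy expression profiles (2 genes) for two cell types, A and B, in two batches.
# Assumption: batch 2 carries an additive shift of +5 on every gene.
cells = {
    "A_batch1": (0.0, 0.0),
    "B_batch1": (2.0, 2.0),
    "A_batch2": (5.0, 5.0),
    "B_batch2": (7.0, 7.0),
}
batch = {"A_batch1": 1, "B_batch1": 1, "A_batch2": 2, "B_batch2": 2}


def center_by_batch(points, batch_of):
    """Naive stand-in correction (NOT BCD): subtract each batch's mean profile."""
    corrected = {}
    for b in set(batch_of.values()):
        members = [name for name in points if batch_of[name] == b]
        n_genes = len(points[members[0]])
        mean = [
            sum(points[m][g] for m in members) / len(members)
            for g in range(n_genes)
        ]
        for m in members:
            corrected[m] = tuple(x - mu for x, mu in zip(points[m], mean))
    return corrected


def nearest(name, points):
    """Return the name of the closest other cell under Euclidean distance."""
    others = [o for o in points if o != name]
    return min(others, key=lambda o: math.dist(points[name], points[o]))


# On raw data the nearest neighbour shares the batch, not the cell type.
raw_nn = nearest("A_batch2", cells)

# After the stand-in correction the nearest neighbour shares the cell type.
corrected_nn = nearest("A_batch2", center_by_batch(cells, batch))

print(raw_nn, corrected_nn)
```

Because the correction changes only the distances between cells, any distance-based clustering or embedding method can consume the result unchanged, which is the kind of integration the abstract claims for BCD.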
License
cc_by
Full text: Available | Collections: Preprints | Database: bioRxiv | Study type: Prognostic study | Language: English | Year of publication: 2020 | Document type: Preprint
...