CellHeap: A Workflow for Optimizing COVID-19 Single-Cell RNA-Seq Data Processing in the Santos Dumont Supercomputer
14th Brazilian Symposium on Bioinformatics, BSB 2021; 13063 LNBI:41-52, 2021.
Article in English | Scopus | ID: covidwho-1598129
ABSTRACT
Currently, several hundred terabytes of COVID-19 single-cell RNA-seq (scRNA-seq) data are available in public repositories. These data cover multiple tissues, comorbidities, and conditions. We expect this trend to continue, and it is realistic to predict that the volume of COVID-19 scRNA-seq data will grow to several petabytes in the coming years. However, thoughtful analysis of these data requires large-scale computing infrastructure and software systems optimized for such platforms to generate biological knowledge. This paper presents CellHeap, a portable and robust workflow for customizable scRNA-seq analyses, with quality control throughout the execution steps, that can be deployed on supercomputers. Furthermore, we present the deployment of CellHeap on the Santos Dumont supercomputer for analyzing COVID-19 scRNA-seq datasets and discuss a case study that processed dozens of terabytes of raw COVID-19 scRNA-seq data. © 2021, Springer Nature Switzerland AG.
Full text: Available
Collection: Databases of international organizations
Database: Scopus
Language: English
Journal: 14th Brazilian Symposium on Bioinformatics, BSB 2021
Year: 2021
Document Type: Article