A Parallelization Strategy for the Time Efficient Analysis of Thousands of LC/MS Runs in High-Performance Computing Environment.
van Zalm, Patrick; Viodé, Arthur; Smolen, Kinga; Fatou, Benoit; Hayati, Arash Nemati; Schlaffner, Christoph N; Levy, Ofer; Steen, Judith; Steen, Hanno.
  • van Zalm P; Department of Pathology, Boston Children's Hospital, and Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, United States.
  • Viodé A; Department of Neuropsychology and Psychopharmacology, EURON, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht 6229ER, The Netherlands.
  • Smolen K; Department of Pathology, Boston Children's Hospital, and Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, United States.
  • Fatou B; Precision Vaccines Program, Boston Children's Hospital, Boston, Massachusetts 02115, United States.
  • Hayati AN; Harvard Medical School, Boston, Massachusetts 02115, United States.
  • Schlaffner CN; Department of Pathology, Boston Children's Hospital, and Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, United States.
  • Levy O; Precision Vaccines Program, Boston Children's Hospital, Boston, Massachusetts 02115, United States.
  • Steen J; Research Computing, Information Services Department, Boston Children's Hospital, Boston, Massachusetts 02115, United States.
  • Steen H; F.M. Kirby Neurobiology Center, Boston Children's Hospital, and Department of Neurology, Harvard Medical School, Boston, Massachusetts 02115, United States.
J Proteome Res ; 21(11): 2810-2814, 2022 Nov 04.
Article in English | MEDLINE | ID: covidwho-2050250
ABSTRACT
Combining robust proteomics instrumentation with high-throughput liquid chromatography (LC) systems (e.g., the timsTOF Pro and the Evosep One system, respectively) has enabled mapping the proteomes of thousands of samples. Fragpipe is one of the few computational protein identification and quantification frameworks that allow for the time-efficient analysis of such large data sets. However, it requires large amounts of computational power and data storage space, which leaves even state-of-the-art workstations underpowered for the analysis of proteomics data sets with thousands of LC mass spectrometry runs. To address this issue, we developed and optimized a Fragpipe-based analysis strategy for a high-performance computing environment and analyzed 3348 plasma samples (6.4 TB) that were longitudinally collected from hospitalized COVID-19 patients under the auspices of the Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study. Our parallelization strategy reduced the total runtime by ∼90%, from 116 (theoretical) days to just 9 days in the high-performance computing environment. All code is open-source and can be deployed in any Simple Linux Utility for Resource Management (SLURM) high-performance computing environment, enabling the analysis of large-scale high-throughput proteomics studies.
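The abstract does not describe how the runs were distributed across the cluster, so the following is only a minimal sketch of one common way to parallelize such a workload on SLURM: split the run list into batches and submit one job-array task per batch. The batch size of 100, the resource requests, the file names, and the wrapper script `run_batch.sh` are all assumptions made for illustration and are not taken from the authors' code. The last function simply reproduces the runtime arithmetic quoted in the abstract.

```python
def split_into_batches(runs, batch_size):
    """Partition the run list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [runs[i:i + batch_size] for i in range(0, len(runs), batch_size)]


def slurm_array_script(n_batches, cpus=8, mem_gb=64, hours=24):
    """Render an sbatch script in which each array task processes one batch.

    The resource values are placeholders, not recommendations.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --array=0-{n_batches - 1}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours}:00:00",
        "# run_batch.sh is a hypothetical wrapper around the per-batch analysis step",
        "./run_batch.sh batch_${SLURM_ARRAY_TASK_ID}.txt",
    ]
    return "\n".join(lines) + "\n"


def runtime_reduction(serial_days, parallel_days):
    """Return (fractional reduction, speedup factor) for two runtimes."""
    return 1 - parallel_days / serial_days, serial_days / parallel_days


# Illustrative run names; the count matches the 3348 samples in the abstract.
runs = [f"run_{i:04d}.d" for i in range(3348)]
batches = split_into_batches(runs, 100)
script = slurm_array_script(len(batches))

# Figures from the abstract: 116 theoretical days versus 9 days on the cluster.
reduction, speedup = runtime_reduction(116, 9)
print(len(batches), f"{reduction:.1%}", f"{speedup:.1f}x")
```

With these assumed settings the 3348 runs fall into 34 batches, and the quoted runtimes correspond to a reduction of about 92% (roughly a 13-fold speedup), consistent with the ∼90% stated in the abstract.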

Full text: Available | Collection: International databases | Database: MEDLINE | Main subject: COVID-19 | Type of study: Cohort study / Observational study / Prognostic study | Limits: Humans | Language: English | Journal: J Proteome Res | Journal subject: Biochemistry | Year: 2022 | Document Type: Article | Affiliation country: Acs.jproteome.2c00278
