Clinical outcomes for patients with COVID-19 are heterogeneous, and there is interest in defining subgroups for prognostic modeling and the development of treatment algorithms. We obtained 28 demographic and laboratory variables from patients admitted to hospital with COVID-19. These patients comprised a training cohort (n = 6099) and two validation cohorts from the first and second waves of the pandemic (n = 996; n = 1011). Uniform manifold approximation and projection (UMAP) dimension reduction and Gaussian mixture model (GMM) analysis were used to define patient clusters. Twenty-nine clusters were defined in the training cohort and were associated with markedly different mortality rates, which were predictive within the validation cohorts. Deconvolution of clinical features within clusters identified unexpected relationships between variables. Integration of large datasets using UMAP-assisted clustering can therefore identify patient subgroups with prognostic information and uncover unexpected interactions between clinical variables. This application of machine learning represents a powerful approach for delineating disease pathogenesis and potential therapeutic interventions.
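
The abstract does not describe how cluster-level mortality was carried from the training cohort to the validation cohorts. As a rough illustration of that final step only, the sketch below assumes each patient already has a cluster label (for example, from a GMM fitted on a UMAP embedding, which is not reproduced here). The function names, the toy labels and outcomes, and the use of a Brier score as the check are all hypothetical choices made for this example, not the authors' method or data.

```python
from collections import defaultdict


def cluster_mortality(labels, died):
    """Return {cluster: fraction of patients who died} for one cohort."""
    counts = defaultdict(lambda: [0, 0])  # cluster -> [deaths, patients]
    for cluster, outcome in zip(labels, died):
        counts[cluster][0] += int(outcome)
        counts[cluster][1] += 1
    return {c: deaths / n for c, (deaths, n) in counts.items()}


def predict_risk(train_rates, labels, default=None):
    """Give each patient the training-cohort mortality rate of their cluster."""
    return [train_rates.get(c, default) for c in labels]


def brier_score(predicted, died):
    """Mean squared error between predicted risk and observed outcome.

    Patients whose cluster was never seen in training (risk is None) are skipped.
    """
    pairs = [(p, int(d)) for p, d in zip(predicted, died) if p is not None]
    return sum((p - d) ** 2 for p, d in pairs) / len(pairs)


# Made-up toy data (NOT from the study): cluster labels and outcomes (1 = died).
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]
train_died = [0, 0, 0, 1, 1, 1, 1, 0]
valid_labels = [0, 0, 1, 1]
valid_died = [0, 0, 1, 1]

train_rates = cluster_mortality(train_labels, train_died)
valid_risk = predict_risk(train_rates, valid_labels)
valid_brier = brier_score(valid_risk, valid_died)
```

In this toy setup a lower Brier score in the validation cohort would suggest that training-cohort cluster mortality transfers; a real analysis would also need calibration and discrimination measures chosen for the study design.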

Background: Age and frailty are risk factors for poor clinical outcomes following SARS-CoV-2 infection. As such, COVID-19 vaccination has been prioritised for older and frail people, but there is concern that immune responses may be impaired due to immune senescence and co-morbidity. Methods: We studied antibody and cellular immune responses following COVID-19 vaccination in 202 staff and 286 residents of long-term care facilities (LTCF). Due to the high prevalence of previous infection within this environment, 50% and 51% of these two groups respectively had serological evidence of prior natural SARS-CoV-2 infection. Results: In both staff and residents with previous infection, the antibody responses following dual vaccination were strong and equivalent across the age course. In contrast, within infection-naïve donors these responses were reduced by 2.4-fold and 8.1-fold respectively, such that values within the resident population were 2.6-fold lower than in staff. Impaired neutralisation of delta variant spike binding was also apparent within donors without prior infection. Spike-specific T cell responses were also markedly enhanced by prior infection and, within infection-naïve donors, were 52% lower in residents than in staff. Post-vaccine spike-specific CD4+ T cell responses displayed single or dual production of IFN-γ and IL-2, whilst previous infection primed for an extended functional profile with TNF-α and CXCL10 production. Interpretation: These data reveal suboptimal post-vaccine immune responses within infection-naïve elderly residents of LTCF and indicate the need for further optimisation of immune protection through the use of booster vaccination.