Results 1 - 3 of 3
1.
Preprint in English | medRxiv | ID: ppmedrxiv-22273922

ABSTRACT

We previously interrogated the relationship between SARS-CoV-2 genetic mutations and associated patient outcomes using publicly available data downloaded from GISAID in October 2020 [1]. Using high-level patient data included in some GISAID submissions, we were able to aggregate patient status values and differentiate between severe and mild COVID-19 outcomes. In our previous publication, we utilized a logistic regression model with an L1 penalty (Lasso regularization) and found several statistically significant associations between genetic mutations and COVID-19 severity. In this work, we explore the applicability of our October 2020 findings to a more current phase of the COVID-19 pandemic. Here we first test our previous models on newer GISAID data downloaded in October 2021 to evaluate the classification ability of each model on expanded datasets. The October 2021 dataset (n=53,787 samples) is approximately 15 times larger than our October 2020 dataset (n=3,637 samples). We show limitations in using a supervised learning approach and a need for expansion of the feature sets based on progression of the COVID-19 pandemic, such as vaccination status. We then re-train on the newer GISAID data and compare the performance of our two logistic regression models. Based on accuracy and Area Under the Curve (AUC) metrics, we find that the AUC of the re-trained October 2021 model is modestly decreased as compared to the October 2020 model. These results are consistent with the increased emergence of multiple mutations, each with a potentially smaller impact on COVID-19 patient outcomes. Bioinformatics scripts used in this study are available at https://github.com/JPEO-CBRND/opendata-variant-analysis. As described in Voss et al. 2021, machine learning scripts are available at https://github.com/Digital-Biobank/covid_variant_severity.
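The two metrics named in the abstract, accuracy and AUC, can be illustrated with a small pure-Python sketch. This is not the authors' code (their scripts are in the repositories linked above). The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made only for illustration.

```python
# Illustrative sketch only: toy data, hypothetical function names.

def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive (severe)
    sample receives a higher score than a randomly chosen negative (mild)
    sample. Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label.
    The 0.5 threshold is an assumption for this sketch."""
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Synthetic example: 1 = severe, 0 = mild (not real GISAID data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

AUC depends only on how the samples are ranked, whereas accuracy depends on the chosen threshold, so the two metrics can move differently when a model is applied to a newer dataset.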

2.
Preprint in English | medRxiv | ID: ppmedrxiv-21266688

ABSTRACT

The 2019 coronavirus disease (COVID-19) pandemic has demonstrated the importance of predicting, identifying, and tracking mutations throughout a pandemic event. As the COVID-19 global pandemic surpassed one year, several variants had emerged resulting in increased severity and transmissibility. In order to reduce the impact on human life, it is critical to rapidly identify which genetic variants result in increased virulence or transmission. To address the former, we evaluated whether a genome-based predictive algorithm designed to predict clinical severity could predict polymerase chain reaction (PCR) results, as a surrogate for viral load and severity. Using a previously published algorithm, we compared the viral genome-based severity predictions to clinically derived PCR-based viral load of 716 viral genomes. For those samples predicted to be "severe" (predicted severity score > 0.5), we observed an average cycle threshold (Ct) of 18.3, whereas those in the "mild" category (severity prediction < 0.5) had an average Ct of 20.4 (P = 0.0017). We found a non-trivial correlation between predicted severity probability and cycle threshold (r = -0.199). Additionally, when divided into quartiles by prediction severity probability, the most probable quartile (≥75% probability) had a Ct of 16.6 (n=10) as compared to those least probable to be severe (<25%) of 21.4 (n=350) (P = 0.0045). Taken together, our results suggest that the severity predicted by a genome-based algorithm can be related to the metrics from the clinical diagnostic test, and that relative severity may be inferred from diagnostic values.
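The comparison described in this abstract (splitting samples at a predicted severity of 0.5, averaging Ct per group, and correlating predicted probability with Ct) can be sketched as follows. This is a hedged illustration, not the authors' pipeline. The function names and the six toy samples are hypothetical, and no significance test is computed.

```python
from math import sqrt

# Illustrative sketch only: toy data, hypothetical function names.

def mean(xs):
    return sum(xs) / len(xs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_ct_by_group(scores, cts, threshold=0.5):
    """Mean Ct for predicted-severe (score > threshold) and predicted-mild
    (score < threshold) samples. Scores exactly at the threshold fall into
    neither group, mirroring the strict inequalities in the abstract."""
    severe = [ct for s, ct in zip(scores, cts) if s > threshold]
    mild = [ct for s, ct in zip(scores, cts) if s < threshold]
    return mean(severe), mean(mild)


# Synthetic example (not the study's 716 genomes).
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
toy_cts = [15.0, 17.0, 19.0, 20.0, 22.0, 21.0]
severe_mean, mild_mean = mean_ct_by_group(toy_scores, toy_cts)
toy_r = pearson_r(toy_scores, toy_cts)
```

A lower Ct corresponds to a higher viral load, so a negative correlation between predicted severity and Ct, as reported above, points in the expected direction.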

3.
Preprint in English | medRxiv | ID: ppmedrxiv-20242149

ABSTRACT

Introduction: The coronavirus disease 2019 (COVID-19) pandemic is a global public health emergency causing a disparate burden of death and disability around the world. The molecular characteristics of the virus that predict better or worse outcome are largely still being discovered. Methods: We downloaded 155,958 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from GISAID and evaluated whether variants improved prediction of reported severity beyond age and region. We also evaluated specific variants to determine the magnitude of association with severity and the frequency of these variants among the genomes. Results: Logistic regression models that included viral genomic variants outperformed other models (AUC=0.91 as compared with 0.68 for age and gender alone; p<0.001). Among individual variants, we found 17 single nucleotide variants in SARS-CoV-2 have more than two-fold greater odds of being associated with higher severity and 67 variants associated with ≤ 0.5 times the odds of severity. The median frequency of associated variants was 0.15% (interquartile range 0.09%-0.45%). Altogether 85% of genomes had at least one variant associated with patient outcome. Conclusion: Numerous SARS-CoV-2 variants have two-fold or greater association with odds of mild or severe outcome and collectively, these variants are common. In addition to comprehensive mitigation efforts, public health measures should be prioritized to control the more severe manifestations of COVID-19 and the transmission chains linked to these severe cases.
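To make the two-fold thresholds concrete, the sketch below computes an unadjusted odds ratio from a 2x2 table of variant presence against outcome and labels it with the cut-offs quoted in the abstract (> 2 and ≤ 0.5). This is an assumption-laden illustration: the abstract does not say how the authors derived their odds, which may come from regression coefficients rather than raw counts. The function names and counts are hypothetical.

```python
# Illustrative sketch only: toy counts, hypothetical function names.

def odds_ratio(severe_with, mild_with, severe_without, mild_without):
    """Unadjusted odds ratio of a severe outcome for genomes carrying a
    variant versus genomes without it."""
    denominator = mild_with * severe_without
    if denominator == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (severe_with * mild_without) / denominator


def classify(or_value):
    """Label an odds ratio using the thresholds quoted in the abstract."""
    if or_value > 2.0:
        return "higher severity"
    if or_value <= 0.5:
        return "lower severity"
    return "no two-fold association"


# Synthetic counts: 30 severe and 10 mild genomes carry the variant;
# 100 severe and 100 mild genomes do not.
toy_or = odds_ratio(30, 10, 100, 100)
toy_label = classify(toy_or)
```

An unadjusted ratio like this ignores covariates such as age and region, which the study's models account for.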
