ABSTRACT
Data science has been an invaluable part of the COVID-19 pandemic response, with applications ranging from tracking viral evolution to understanding the effectiveness of interventions. Asymptomatic breakthrough infections have been a major problem during the ongoing global surge of the Delta variant. Serological discrimination of vaccine response from infection has so far been limited to Spike protein vaccines used in higher-income regions. Here, we show for the first time how statistical and machine learning (ML) approaches can discriminate SARS-CoV-2 infection from the immune response to an inactivated whole-virion vaccine (BBV152, Covaxin, India), thereby permitting real-world vaccine effectiveness assessments from cohort-based serosurveys in Asia and Africa, where such vaccines are commonly used. Briefly, we accessed serial data on anti-S and anti-NC antibody concentrations, along with age, sex, number of doses, and number of days since the last vaccine dose, for 1823 Covaxin recipients. An ensemble ML model, combining a consensus clustering approach with a support vector machine (SVM), was built on 1063 samples for which reliable qualifying data existed and then applied to the entire dataset. Of 1448 subjects who self-reported as negative, 724 were classified as infected. Because the vaccine contains wild-type virus, the antibodies it induces neutralize the wild type much better than the Delta variant; we therefore determined the relative ability of a random subset of such samples to neutralize the Delta versus the wild-type strain. In 100 of 156 samples where the ML prediction differed from the self-reported uninfected status, the Delta variant was neutralized more effectively than the wild type, which cannot happen without infection. The fraction rose to 71.8% (28 of 39) in subjects predicted to be infected during the surge, which is concordant with the percentage of sequences classified as Delta (75.6%-80.2%) over the same period.
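The validation logic described above (a sample that neutralizes Delta better than wild type is taken as evidence of infection) can be sketched in a few lines of Python. This is only an illustration: the function names and the four sample readouts are hypothetical and are not the study's data or pipeline. Only the counts 100 of 156 and 28 of 39 come from the abstract.

```python
# Illustrative sketch only: names and sample values are hypothetical,
# not the study's data or analysis code.

def neutralizes_delta_better(wild_type: float, delta: float) -> bool:
    """True when a sample neutralizes Delta more strongly than wild type."""
    return delta > wild_type


def percent(count: int, total: int) -> float:
    """Percentage of count in total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical (wild_type, delta) neutralization readouts for four samples.
samples = [(62.0, 81.0), (75.0, 40.0), (30.0, 55.0), (90.0, 88.0)]
flagged = sum(neutralizes_delta_better(wt, d) for wt, d in samples)

print(percent(flagged, len(samples)))  # share of hypothetical samples flagged
print(percent(100, 156))  # discordant samples reported in the abstract
print(percent(28, 39))    # surge-period subset reported in the abstract
```

The last line reproduces the 71.8% figure quoted in the abstract; the 100 of 156 count corresponds to roughly 64%.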
ABSTRACT
Distinct viral clades have emerged among SARS-CoV2 variants across the world and in India. Identifying the genomic diversity and phylodynamic profiles of the strains prevalent in the country is critical to understanding the evolution and spread of the variants. We performed whole-genome sequencing of 54 SARS-CoV2 strains collected from COVID-19 patients in Kolkata, West Bengal, from August to October 2020. Phylogeographic and phylodynamic analyses were performed using these 54 sequences and other sequences from India and abroad available in the GISAID database. We analyzed the spatio-temporal evolutionary dynamics of the pathogen across various regions and states of India over three different time periods in 2020. We estimated the clade dynamics of the Indian strains and compared the clade-specific mutations and co-mutation patterns across states and union territories of India over time. We observed that the GR, GH and G (GISAID) or 20B and 20A (Nextstrain) clades were the prevalent clades in India during the middle and later part of 2020. However, the frequent mutations and co-mutations observed within the major clades show little overlap across time periods, indicating the emergence of newer mutations in the viral population prevailing in the country. Further, we explored the possible association of specific mutations and co-mutations with infection outcomes in Indian patients.
ABSTRACT
To understand the spread of SARS-CoV2, the Council of Scientific and Industrial Research (India) conducted a sero-survey across its constituent laboratories and centers in India in August and September 2020. Of 10,427 volunteers, 1058 (10.14%) tested positive for SARS-CoV2 anti-nucleocapsid (anti-NC) antibodies, of whom 95% had surrogate neutralization activity. Three-fourths recalled no symptoms. Repeat serology tests at 3 (n=346) and 6 (n=35) months confirmed the stability of the antibody response and neutralization potential. Local sero-positivity was higher in densely populated cities and was inversely correlated with a 30-day change in regional test positivity rates (TPR). Regional sero-positivity above 10% was associated with declining TPR. Personal factors associated with higher odds of sero-positivity were high-exposure work (odds ratio, 95% CI, p value: 2.23, 1.92-2.59, 6.5E-26), use of public transport (1.79, 1.43-2.24, 2.8E-06), not smoking (1.52, 1.16-1.99, 0.02), non-vegetarian diet (1.67, 1.41-1.99, 3.0E-08), and B blood group (1.36, 1.15-1.61, 0.001). Impact Statement: Widespread asymptomatic and undetected SARS-CoV2 infection affected more than 100 million Indians by September 2020. Declining new cases thereafter may be due to persisting humoral immunity amongst sub-communities with high exposure. Funding: Council of Scientific and Industrial Research, India (CSIR).
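For readers unfamiliar with how an odds ratio and its 95% confidence interval are derived, the following minimal Python sketch computes both from a 2x2 table using the standard Wald (Woolf) log-odds method. The abstract does not state how its odds ratios were estimated (they may be adjusted estimates from a regression model), so this is an assumption for illustration. The function name and the counts are hypothetical and are not taken from the survey.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    a: exposed and sero-positive      b: exposed and sero-negative
    c: unexposed and sero-positive    d: unexposed and sero-negative
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_, lo, hi = odds_ratio_ci(a=40, b=160, c=60, d=540)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

With these made-up counts the odds ratio is (40 × 540) / (160 × 60) = 2.25. An interval that excludes 1 indicates an association at the chosen confidence level.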