ABSTRACT
As SARS-CoV-2 continues to spread and evolve, detecting emerging variants early is critical for public health interventions. Inferring lineage prevalence by clinical testing is infeasible at scale, especially in areas with limited resources, participation, or testing/sequencing capacity, and these limitations can also introduce biases. SARS-CoV-2 RNA concentration in wastewater successfully tracks regional infection dynamics and provides less biased abundance estimates than clinical testing. Tracking virus genomic sequences in wastewater would improve community prevalence estimates and help detect emerging variants. However, two factors limit wastewater-based genomic surveillance: low-quality sequence data and the inability to estimate relative lineage abundance in mixed samples. Here, we resolve these critical issues to perform a high-resolution, 295-day wastewater and clinical sequencing effort, in the controlled environment of a large university campus and the broader context of the surrounding county. We develop and deploy improved virus concentration protocols and deconvolution software that fully resolve multiple virus strains from wastewater. We detect emerging variants of concern up to 14 days earlier in wastewater samples, and identify multiple instances of virus spread not captured by clinical genomic surveillance. Our study provides a scalable solution for wastewater genomic surveillance that allows early detection of SARS-CoV-2 variants and identification of cryptic transmission.
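The abstract does not describe how the deconvolution software works internally, so the following is only a minimal sketch of one generic way to estimate relative lineage abundance from a mixed sample. It assumes the observed frequency of each mutation is approximately the sum of the abundances of the lineages carrying it, and fits non-negative abundances by projected gradient descent on the least-squares error. The function name `deconvolve`, the `SIGNATURES` matrix, and the toy frequencies are all hypothetical and are not taken from the study.

```python
# Toy lineage-by-mutation signature matrix (hypothetical, for illustration only).
# Rows are mutations, columns are lineages; 1 means the lineage carries the mutation.
SIGNATURES = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]


def deconvolve(signatures, observed, iterations=2000):
    """Estimate lineage fractions x >= 0 that minimize ||A x - b||^2.

    signatures: A, one row per mutation and one column per lineage.
    observed:   b, the mutation frequencies measured in the mixed sample.
    Returns the fitted abundances rescaled to sum to 1.
    """
    n_mut = len(signatures)
    n_lin = len(signatures[0])
    # 1 / ||A||_F^2 never exceeds 1 / lambda_max(A^T A), so each step is stable.
    step = 1.0 / sum(v * v for row in signatures for v in row)
    x = [1.0 / n_lin] * n_lin  # start from a uniform mixture
    for _ in range(iterations):
        # residual = A x - b
        residual = [
            sum(signatures[i][j] * x[j] for j in range(n_lin)) - observed[i]
            for i in range(n_mut)
        ]
        # gradient of 0.5 * ||A x - b||^2 is A^T (A x - b)
        grad = [
            sum(signatures[i][j] * residual[i] for i in range(n_mut))
            for j in range(n_lin)
        ]
        # gradient step, then clip negatives to keep abundances non-negative
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_lin)]
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


# Synthetic mutation frequencies for a 70/20/10 mixture of the three toy lineages.
observed = [0.7, 0.9, 0.2, 0.3, 0.1]
estimate = deconvolve(SIGNATURES, observed)
print([round(v, 3) for v in estimate])
```

Real wastewater data would add sequencing noise, uneven coverage, and lineages with overlapping mutation sets, so a practical tool would need weighting and regularization beyond this sketch.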
ABSTRACT
Large-scale wastewater surveillance can greatly augment the tracking of infection dynamics, especially in communities where prevalence far exceeds testing capacity. However, current methods for viral detection in wastewater scale poorly to high throughput. In the present study, we employed an automated magnetic-bead-based concentration approach for viral detection in sewage that can be scaled up to process 24 samples in a single 40-minute run. The method compared favorably with conventional methods for concentrating virus from wastewater, with higher recovery efficiencies from input sample volumes as low as 10 mL, and can enable the processing of over 100 wastewater samples in a day. The high-throughput protocol was sensitive enough to detect as few as 2 cases in a hospital building with a known COVID-19 caseload. Using the high-throughput pipeline, samples from the influent stream of the primary wastewater treatment plant of San Diego County (serving 2.3 million residents) were processed for a period of 13 weeks. Wastewater estimates of SARS-CoV-2 viral genome copies in raw untreated wastewater correlated strongly with clinically reported cases in the county and, when used alongside past reported case numbers and temporal information in an autoregressive integrated moving average (ARIMA) model, enabled prediction of new reported cases up to 3 weeks in advance. Taken together, the results show that high-throughput surveillance could greatly improve comprehensive community prevalence assessments by providing robust, rapid estimates.
IMPORTANCE
Wastewater monitoring has a lot of potential for revealing COVID-19 outbreaks before they happen because the virus is found in the wastewater before people have clinical symptoms. However, application of wastewater-based surveillance has been limited by long processing times, particularly at the concentration step. Here we introduce a much faster method of processing the samples and show its robustness through direct comparisons with existing methods and by showing that we can predict cases in San Diego one week ahead with excellent accuracy, and three weeks ahead with fair accuracy, using city sewage. The automated viral concentration method will greatly alleviate the major bottleneck in wastewater processing by reducing the turnaround time during epidemics.
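The throughput figures quoted in the abstract (24 samples per 40-minute run, over 100 samples in a day) can be checked with a short calculation. This is a back-of-the-envelope sketch only: it assumes a single instrument running back-to-back with no loading time between runs and ignores every step other than concentration. The function names are hypothetical.

```python
import math

SAMPLES_PER_RUN = 24  # stated in the abstract
MINUTES_PER_RUN = 40  # stated in the abstract


def runs_needed(n_samples):
    """Number of full concentration runs required for n_samples."""
    return math.ceil(n_samples / SAMPLES_PER_RUN)


def instrument_minutes(n_samples):
    """Total run time in minutes, assuming back-to-back runs (an assumption)."""
    return runs_needed(n_samples) * MINUTES_PER_RUN


print(runs_needed(100), instrument_minutes(100))  # 5 runs, 200 minutes
```

Under these assumptions, 100 samples need five runs, or a little over three hours of instrument time, which is consistent with the stated capacity of more than 100 samples per day.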