ABSTRACT
The Alpha, Beta and Gamma SARS-CoV-2 Variants of Concern (VOCs) co-circulated globally during 2020-21, fueling waves of infections. They were displaced by Delta during a third wave worldwide in 2021, which was in turn displaced by Omicron in late 2021. In this study, we use phylogenetic and phylogeographic methods to reconstruct the dispersal patterns of VOCs worldwide. We find that source-sink dynamics varied substantially by VOC, and identify countries that acted as global and regional hubs of dissemination. We demonstrate a declining role of presumed origin countries of VOCs in their global dispersal, estimating that India contributed <15% of Delta exports and South Africa <1-2% of Omicron dispersal. We estimate that >80 countries had received introductions of Omicron within 100 days of emergence, associated with accelerating passenger air travel and higher transmissibility. Our study highlights the rapid dispersal of highly transmissible variants with implications for genomic surveillance along the hierarchical airline network. Data analysis clarifies that dispersal of SARS-CoV-2 variants from their sites of initial detection was related to the amount of global air travel at the time of the variant's emergence, and that travel volume through "hub" sites distinct from the site of emergence was a key driver of variant spread.
ABSTRACT
Ethiopia is the second most populous country in Africa and the sixth most affected by COVID-19 on the continent. Despite having experienced five infection waves, >499,000 cases, and ~7500 COVID-19-related deaths as of January 2023, there is still no detailed genomic epidemiological report on the introduction and spread of SARS-CoV-2 in Ethiopia. In this study, we reconstructed and elucidated the COVID-19 epidemic dynamics. Specifically, we investigated the introduction, local transmission, ongoing evolution, and spread of SARS-CoV-2 during the first four infection waves using 353 high-quality near-whole genomes sampled in Ethiopia. Our results show that whereas viral introductions seeded the first wave, subsequent waves were seeded by local transmission. The B.1.480 lineage emerged in the first wave and notably remained in circulation even after the emergence of the Alpha variant. B.1.480 was eventually outcompeted by the Delta variant. Notably, Ethiopia's lack of local sequencing capacity was compounded by sporadic, uneven, and insufficient sampling, which limited the incorporation of genomic epidemiology into the epidemic public health response in Ethiopia. These results highlight Ethiopia's role in SARS-CoV-2 dissemination and the urgent need for balanced, near-real-time genomic sequencing.
Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Molecular Epidemiology , SARS-CoV-2/genetics , Ethiopia/epidemiology , COVID-19/epidemiology , COVID-19/genetics
ABSTRACT
An important unmet need revealed by the COVID-19 pandemic is the near-real-time identification of potentially fitness-altering mutations within rapidly growing SARS-CoV-2 lineages. Although powerful molecular sequence analysis methods are available to detect and characterize patterns of natural selection within modestly sized gene-sequence datasets, the computational complexity of these methods and their sensitivity to sequencing errors render them effectively inapplicable in large-scale genomic surveillance contexts. Motivated by the need to analyze new lineage evolution in near-real time using large numbers of genomes, we developed the Rapid Assessment of Selection within CLades (RASCL) pipeline. RASCL applies state-of-the-art phylogenetic comparative methods to evaluate selective processes acting at individual codon sites and across whole genes. RASCL is scalable and produces automatically updated, regular, lineage-specific selection analysis reports, even for lineages that include tens or hundreds of thousands of sampled genome sequences. Key to this performance is (i) generation of automatically subsampled high-quality datasets of gene/ORF sequences drawn from a selected "query" viral lineage; (ii) contextualization of these query sequences in codon alignments that include high-quality "background" sequences representative of global SARS-CoV-2 diversity; and (iii) the extensive parallelization of a suite of computationally intensive selection analysis tests. Within hours of being deployed to analyze a novel rapidly growing lineage of interest, RASCL will begin yielding JavaScript Object Notation (JSON)-formatted reports that can be either imported into third-party analysis software or explored in standard web-browsers using the premade RASCL interactive data visualization dashboard. By enabling the rapid detection of genome sites evolving under different selective regimes, RASCL is well-suited for near-real-time monitoring of the population-level selective processes that will likely underlie the emergence of future variants of concern in measurably evolving pathogens with extensive genomic surveillance.
Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , COVID-19/genetics , Phylogeny , Codon/genetics , Sequence Analysis , Genome, Viral
ABSTRACT
Investment in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing in Africa over the past year has led to a major increase in the number of sequences that have been generated and used to track the pandemic on the continent, a number that now exceeds 100,000 genomes. Our results show an increase in the number of African countries that are able to sequence domestically and highlight that local sequencing enables faster turnaround times and more-regular routine surveillance. Despite limitations of low testing proportions, findings from this genomic surveillance study underscore the heterogeneous nature of the pandemic and illuminate the distinct dispersal dynamics of variants of concern (particularly Alpha, Beta, Delta, and Omicron) on the continent. Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve while the continent faces many emerging and reemerging infectious disease threats. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century.
Subject(s)
COVID-19 , Epidemiological Monitoring , Pandemics , SARS-CoV-2 , Africa/epidemiology , COVID-19/epidemiology , COVID-19/virology , Genomics , Humans , SARS-CoV-2/genetics
ABSTRACT
COVID-19 was first diagnosed in Egypt on 14 February 2020. By the end of November 2021, over 333,840 cases and 18,832 deaths had been reported. As part of the national genomic surveillance, 1027 SARS-CoV-2 near whole-genomes were generated and published by the end of July 2021. Here we describe the genomic epidemiology of SARS-CoV-2 in Egypt over this period using a subset of 976 high-quality Egyptian genomes analyzed together with a representative set of global sequences within a phylogenetic framework. A single lineage, C.36, introduced early in the pandemic was responsible for most of the cases in Egypt. Furthermore, to remain dominant in the face of mounting immunity from previous infections and vaccinations, this lineage acquired several mutations known to confer an adaptive advantage. These results highlight the value of continuous genomic surveillance in regions where VOCs are not predominant and the need for enforcement of public health measures to prevent expansion of the existing lineages.
Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Egypt/epidemiology , Humans , Mutation , Pandemics , Phylogeny , SARS-CoV-2/genetics
ABSTRACT
Recombination contributes to the genetic diversity found in coronaviruses and is known to be a prominent mechanism whereby they evolve. It is apparent, both from controlled experiments and in genome sequences sampled from nature, that patterns of recombination in coronaviruses are non-random and that this is likely attributable to a combination of sequence features that favour the occurrence of recombination break points at specific genomic sites, and selection disfavouring the survival of recombinants within which favourable intra-genome interactions have been disrupted. Here we leverage available whole-genome sequence data for six coronavirus subgenera to identify specific patterns of recombination that are conserved between multiple subgenera and then identify the likely factors that underlie these conserved patterns. Specifically, we confirm the non-randomness of recombination break points across all six tested coronavirus subgenera, locate conserved recombination hot- and cold-spots, and determine that the locations of transcriptional regulatory sequences are likely major determinants of conserved recombination break-point hotspot locations. We find that while the locations of recombination break points are not uniformly associated with degrees of nucleotide sequence conservation, they display significant tendencies in multiple coronavirus subgenera to occur in low guanine-cytosine content genome regions, in non-coding regions, at the edges of genes, and at sites within the Spike gene that are predicted to be minimally disruptive of Spike protein folding. While it is apparent that sequence features such as transcriptional regulatory sequences are likely major determinants of where the template-switching events that yield recombination break points most commonly occur, it is evident that selection against misfolded recombinant proteins also strongly impacts observable recombination break-point distributions in coronavirus genomes sampled from nature.
ABSTRACT
Three lineages (BA.1, BA.2 and BA.3) of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron variant of concern predominantly drove South Africa's fourth Coronavirus Disease 2019 (COVID-19) wave. We have now identified two new lineages, BA.4 and BA.5, responsible for a fifth wave of infections. The spike proteins of BA.4 and BA.5 are identical, and similar to BA.2 except for the addition of the 69-70 deletion (present in the Alpha variant and the BA.1 lineage), L452R (present in the Delta variant), F486V and the wild-type amino acid at Q493. The two lineages differ only outside of the spike region. The 69-70 deletion in spike allows these lineages to be identified by the proxy marker of S-gene target failure, on the background of variants not possessing this feature. BA.4 and BA.5 have rapidly replaced BA.2, reaching more than 50% of sequenced cases in South Africa by the first week of April 2022. Using a multinomial logistic regression model, we estimated growth advantages for BA.4 and BA.5 of 0.08 (95% confidence interval (CI): 0.08-0.09) and 0.10 (95% CI: 0.09-0.11) per day, respectively, over BA.2 in South Africa. The continued discovery of genetically diverse Omicron lineages points to the hypothesis that a discrete reservoir, such as human chronic infections and/or animal hosts, is potentially contributing to further evolution and dispersal of the virus.
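To give a feel for what a per-day growth advantage of 0.08 or 0.10 implies, the following is a minimal sketch, not the authors' model. It assumes the reported growth advantage is a difference in logistic (exponential) growth rates, so that the odds of a new lineage relative to BA.2 grow by a factor of exp(advantage x days). The function names are illustrative only.

```python
import math


def odds_multiplier(growth_advantage_per_day: float, days: float) -> float:
    """Factor by which the odds (new lineage : BA.2) change after `days`.

    Assumption: the growth advantage is a difference in exponential
    growth rates per day, as in a logistic-growth interpretation.
    """
    return math.exp(growth_advantage_per_day * days)


def odds_doubling_time(growth_advantage_per_day: float) -> float:
    """Days needed for the odds (new lineage : BA.2) to double."""
    return math.log(2) / growth_advantage_per_day


# Point estimates quoted in the abstract (per day, relative to BA.2).
for lineage, advantage in [("BA.4", 0.08), ("BA.5", 0.10)]:
    weekly = odds_multiplier(advantage, 7)
    doubling = odds_doubling_time(advantage)
    print(f"{lineage}: odds x{weekly:.2f} per week, doubling every {doubling:.1f} days")
```

Under this assumption, an advantage of 0.10 per day means the odds of BA.5 relative to BA.2 roughly double each week, which is consistent with the rapid replacement described above.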
Subject(s)
COVID-19 , SARS-CoV-2 , Amino Acids , Animals , COVID-19/epidemiology , Humans , SARS-CoV-2/genetics , South Africa/epidemiology , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
Recombination is an evolutionary process by which many pathogens generate diversity and acquire novel functions. Although a common occurrence during coronavirus replication, detection of recombination is only feasible when genetically distinct viruses contemporaneously infect the same host. Here, we identify an instance of SARS-CoV-2 superinfection, whereby an individual was infected with two distinct viral variants: Alpha (B.1.1.7) and Epsilon (B.1.429). This superinfection was first noted when an Alpha genome sequence failed to exhibit the classic S gene target failure behavior used to track this variant. Full genome sequencing from four independent extracts reveals that Alpha variant alleles comprise around 75% of the genomes, whereas the Epsilon variant alleles comprise around 20% of the sample. Further investigation reveals the presence of numerous recombinant haplotypes spanning the genome, specifically in the spike, nucleocapsid, and ORF 8 coding regions. These findings support the potential for recombination to reshape SARS-CoV-2 genetic diversity.
Subject(s)
COVID-19 , Superinfection , Genome, Viral/genetics , Humans , New York City/epidemiology , Recombination, Genetic , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
A canine coronavirus (CCoV) has now been reported from two independent human samples from Malaysia (respiratory, collected in 2017-2018; CCoV-HuPn-2018) and Haiti (urine, collected in 2017); these two viruses were nearly genetically identical. In an effort to identify any novel adaptations associated with this apparent shift in tropism, we carried out detailed evolutionary analyses of the spike gene of this virus in the context of related Alphacoronavirus 1 species. The spike 0-domain retains homology to CCoV2b (enteric infections) and Transmissible Gastroenteritis Virus (TGEV; enteric and respiratory). This domain is subject to relaxed selection pressure and an increased rate of molecular evolution. It contains unique amino acid substitutions, including within a region important for sialic acid binding and pathogenesis in TGEV. Overall, the spike gene is extensively recombinant, with a feline coronavirus type II strain serving a prominent role in the recombinant history of the virus. The molecular divergence time, for a segment of the gene where temporal signal could be determined, was estimated at around 60 years ago. We hypothesize that the virus had an enteric origin, but that it may be losing that particular tropism, possibly because of mutations in the sialic acid binding region of the spike 0-domain.
Subject(s)
Coronavirus, Canine , Animals , Cats , Dogs , N-Acetylneuraminic Acid , Spike Glycoprotein, Coronavirus/genetics , Tropism , Zoonoses
ABSTRACT
Global genomic surveillance of SARS-CoV-2 has identified variants associated with increased transmissibility, neutralization resistance and disease severity. Here we report the emergence of the PANGO lineage C.1.2, detected at low prevalence in South Africa and eleven other countries. The initial C.1.2 detection is associated with a high substitution rate, and includes changes within the spike protein that have been associated with increased transmissibility or reduced neutralization sensitivity in SARS-CoV-2 variants of concern or variants of interest. Like Beta and Delta, C.1.2 shows significantly reduced neutralization sensitivity to plasma from vaccinees and individuals infected with the ancestral D614G virus. In contrast, convalescent donors infected with either Beta or Delta show high plasma neutralization against C.1.2. These functional data suggest that vaccine efficacy against C.1.2 will be equivalent to Beta and Delta, and that prior infection with either Beta or Delta will likely offer protection against C.1.2.
Subject(s)
COVID-19 , SARS-CoV-2 , Antibodies, Neutralizing , Antibodies, Viral , Humans , Neutralization Tests , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
The SARS-CoV-2 Omicron BA.1 variant emerged in 2021 (ref. 1) and has multiple mutations in its spike protein (ref. 2). Here we show that the spike protein of Omicron has a higher affinity for ACE2 compared with Delta, and a marked change in its antigenicity increases Omicron's evasion of therapeutic monoclonal and vaccine-elicited polyclonal neutralizing antibodies after two doses. mRNA vaccination as a third vaccine dose rescues and broadens neutralization. Importantly, the antiviral drugs remdesivir and molnupiravir retain efficacy against Omicron BA.1. Replication was similar for Omicron and Delta virus isolates in human nasal epithelial cultures. However, in lung cells and gut cells, Omicron demonstrated lower replication. Omicron spike protein was less efficiently cleaved compared with Delta. The differences in replication were mapped to the entry efficiency of the virus on the basis of spike-pseudotyped virus assays. The defect in entry of Omicron pseudotyped virus to specific cell types effectively correlated with higher cellular RNA expression of TMPRSS2, and deletion of TMPRSS2 affected Delta entry to a greater extent than Omicron. Furthermore, drug inhibitors targeting specific entry pathways (ref. 3) demonstrated that the Omicron spike inefficiently uses the cellular protease TMPRSS2, which promotes cell entry through plasma membrane fusion, with greater dependency on cell entry through the endocytic pathway. Consistent with suboptimal S1/S2 cleavage and inability to use TMPRSS2, syncytium formation by the Omicron spike was substantially impaired compared with the Delta spike. The less efficient spike cleavage of Omicron at S1/S2 is associated with a shift in cellular tropism away from TMPRSS2-expressing cells, with implications for altered pathogenesis.
Subject(s)
COVID-19/pathology , COVID-19/virology , Membrane Fusion , SARS-CoV-2/metabolism , SARS-CoV-2/pathogenicity , Serine Endopeptidases/metabolism , Virus Internalization , Adult , Aged , Aged, 80 and over , Angiotensin-Converting Enzyme 2/metabolism , Animals , Antibodies, Neutralizing/immunology , Antibodies, Viral/immunology , COVID-19/immunology , COVID-19 Vaccines/immunology , Cell Line , Cell Membrane/metabolism , Cell Membrane/virology , Chlorocebus aethiops , Convalescence , Female , Humans , Immune Sera/immunology , Intestines/pathology , Intestines/virology , Lung/pathology , Lung/virology , Male , Middle Aged , Mutation , Nasal Mucosa/pathology , Nasal Mucosa/virology , SARS-CoV-2/drug effects , SARS-CoV-2/immunology , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/metabolism , Tissue Culture Techniques , Virulence , Virus Replication
ABSTRACT
Among the 30 nonsynonymous nucleotide substitutions in the Omicron S-gene are 13 that have only rarely been seen in other SARS-CoV-2 sequences. These mutations cluster within three functionally important regions of the S-gene at sites that will likely impact (1) interactions between subunits of the Spike trimer and the predisposition of subunits to shift from down to up configurations, (2) interactions of Spike with ACE2 receptors, and (3) the priming of Spike for membrane fusion. We show here that, based on both the rarity of these 13 mutations in intrapatient sequencing reads and patterns of selection at the codon sites where the mutations occur in SARS-CoV-2 and related sarbecoviruses, prior to the emergence of Omicron the mutations would have been predicted to decrease the fitness of any virus within which they occurred. We further propose that the mutations in each of the three clusters therefore cooperatively interact to both mitigate their individual fitness costs, and, in combination with other mutations, adaptively alter the function of Spike. Given the evident epidemic growth advantages of Omicron over all previously known SARS-CoV-2 lineages, it is crucial to determine both how such complex and highly adaptive mutation constellations were assembled within the Omicron S-gene, and why, despite unprecedented global genomic surveillance efforts, the early stages of this assembly process went completely undetected.
Subject(s)
COVID-19 , Spike Glycoprotein, Coronavirus , COVID-19/genetics , Humans , Mutation , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
BACKGROUND: The SARS-CoV-2 omicron (B.1.1.529) variant, which was first identified in November, 2021, spread rapidly in many countries, with a spike protein highly diverged from previously known variants, and raised concerns that this variant might evade neutralising antibody responses. We therefore aimed to characterise the sensitivity of the omicron variant to neutralisation. METHODS: For this cross-sectional study, we cloned the sequence encoding the omicron spike protein from a diagnostic sample to establish an omicron pseudotyped virus neutralisation assay. We quantified the neutralising antibody ID50 (the reciprocal dilution that produces 50% inhibition) against the omicron spike protein, and the fold-change in ID50 relative to the spike of wild-type SARS-CoV-2 (ie, the pandemic founder variant), for one convalescent reference plasma pool (WHO International Standard for anti-SARS-CoV-2 immunoglobulin [20/136]), three reference serum pools from vaccinated individuals, and two cohorts from Stockholm, Sweden: one comprising previously infected hospital workers (17 sampled in November, 2021, after vaccine rollout and nine in June or July, 2020, before vaccination) and one comprising serum from 40 randomly sampled blood donors donated during week 48 (Nov 29-Dec 5) of 2021. Furthermore, we assessed the neutralisation of omicron by five clinically relevant monoclonal antibodies (mAbs). FINDINGS: Neutralising antibody responses in reference sample pools sampled shortly after infection or vaccination were substantially less potent against the omicron variant than against wild-type SARS-CoV-2 (seven-fold to 42-fold reduction in ID50 titres). Similarly, for sera obtained before vaccination in 2020 from a cohort of convalescent hospital workers, neutralisation of the omicron variant was low to undetectable (all ID50 titres <20). However, in serum samples obtained in 2021 from two cohorts in Stockholm, substantial cross-neutralisation of the omicron variant was observed. 
Sera from 17 hospital workers after infection and subsequent vaccination had a reduction in average potency of only five-fold relative to wild-type SARS-CoV-2 (geometric mean ID50 titre 495 vs 105), and two donors had no reduction in potency. A similar pattern was observed in randomly sampled blood donors (n=40), who had an eight-fold reduction in average potency against the omicron variant compared with wild-type SARS-CoV-2 (geometric mean ID50 titre 369 vs 45). We found that the omicron variant was resistant to neutralisation (50% inhibitory concentration [IC50] >10 µg/mL) by mAbs casirivimab (REGN-10933), imdevimab (REGN-10987), etesevimab (Ly-CoV016), and bamlanivimab (Ly-CoV555), which form part of antibody combinations used in the clinic to treat COVID-19. However, S309, the parent of sotrovimab, retained most of its activity, with only an approximately two-fold reduction in potency against the omicron variant compared with ancestral D614G SARS-CoV-2 (IC50 0·1-0·2 µg/mL). INTERPRETATION: These data highlight the extensive, but incomplete, evasion of neutralising antibody responses by the omicron variant, and suggest that boosting with licensed vaccines might be sufficient to raise neutralising antibody titres to protective levels. FUNDING: European Union Horizon 2020 research and innovation programme, European and Developing Countries Clinical Trials Partnership, SciLifeLab, and the Erling-Persson Foundation.
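The fold reductions quoted in the findings follow directly from the ratios of geometric mean ID50 titres. The snippet below is a small illustrative sketch of that arithmetic, not the study's analysis code. The helper names are hypothetical, and the individual titres passed to `geometric_mean` are invented purely to show how a geometric mean is formed.

```python
import math


def geometric_mean(titres):
    """Geometric mean of positive titres (mean taken on the log scale)."""
    return math.exp(sum(math.log(t) for t in titres) / len(titres))


def fold_reduction(gmt_wild_type: float, gmt_variant: float) -> float:
    """Fold reduction in potency: wild-type GMT divided by variant GMT."""
    return gmt_wild_type / gmt_variant


# Geometric mean ID50 titres quoted in the abstract (wild type vs omicron).
print(f"hospital workers: {fold_reduction(495, 105):.1f}-fold")
print(f"blood donors:     {fold_reduction(369, 45):.1f}-fold")

# Hypothetical individual titres, for illustration only.
print(f"example GMT: {geometric_mean([10, 1000]):.0f}")
```

The ratios of roughly 4.7 and 8.2 match the "five-fold" and "eight-fold" reductions reported for the two Stockholm cohorts.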
Subject(s)
COVID-19 , SARS-CoV-2 , Antibodies, Monoclonal , Antibodies, Monoclonal, Humanized , Antibodies, Neutralizing , Antibodies, Viral , COVID-19/epidemiology , COVID-19 Vaccines , Cross-Sectional Studies , Humans , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is adaptively evolving to ensure its persistence within human hosts. It is therefore necessary to continuously monitor the emergence and prevalence of novel variants that arise. Importantly, some mutations have been associated with both molecular diagnostic failures and reduced or abrogated next-generation sequencing (NGS) read coverage in some genomic regions. Such impacts are particularly problematic when they occur in genomic regions such as those that encode the spike (S) protein, which are crucial for identifying and tracking the prevalence and dissemination dynamics of concerning viral variants. Targeted Sanger sequencing presents a fast and cost-effective means to accurately extend the coverage of whole-genome sequences. We designed a custom set of primers to amplify a 401 bp segment of the receptor-binding domain (RBD) (between positions 22698 and 23098 relative to the Wuhan-Hu-1 reference). We then designed a Sanger sequencing wet-laboratory protocol. We applied the primer set and wet-laboratory protocol to sequence 222 samples that were missing positions with key mutations K417N, E484K, and N501Y due to poor coverage after NGS. Finally, we developed SeqPatcher, a Python-based computational tool to analyse the trace files yielded by Sanger sequencing to generate consensus sequences, or take preanalysed consensus sequences in fasta format, and merge them with their corresponding whole-genome assemblies. We successfully sequenced 153 samples of 222 (69%) using Sanger sequencing and confirmed the occurrence of key Beta variant mutations (K417N, E484K, N501Y) in the S genes of 142 of 153 (93%) samples. Additionally, one sample had the Y508F mutation and four samples the S477N. Samples with RT-PCR Ct scores ranging from 13.85 to 37.47 (mean=25.70) could be Sanger sequenced efficiently. These results show that our method and pipeline can be used to improve the quality of whole-genome assemblies produced using NGS and can be used with any pairs of the most used NGS and Sanger sequencing platforms.
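The idea of merging a Sanger consensus into a whole-genome assembly can be illustrated with a toy sketch. This is not SeqPatcher's implementation. It assumes both sequences are already in the same reference coordinates with no insertions or deletions, and that only ambiguous bases (N) in the assembly should be filled from the Sanger consensus. The function names and the toy sequences are hypothetical.

```python
def amplicon_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates."""
    return end - start + 1


def patch_region(assembly: str, sanger_consensus: str, start: int, end: int) -> str:
    """Fill N bases in assembly[start..end] (1-based, inclusive) from Sanger.

    Assumption: `sanger_consensus` covers exactly the region start..end and
    is aligned base-for-base to the assembly (no indels).
    """
    if len(sanger_consensus) != amplicon_length(start, end):
        raise ValueError("Sanger consensus does not match the region length")
    region = assembly[start - 1:end]
    # Keep confident assembly bases; take the Sanger base only where NGS gave N.
    patched = "".join(
        s if a in "Nn" else a for a, s in zip(region, sanger_consensus)
    )
    return assembly[:start - 1] + patched + assembly[end:]


# Coordinates quoted in the abstract (relative to Wuhan-Hu-1).
print(amplicon_length(22698, 23098))

# Toy example: positions 5-8 lack NGS coverage and are filled from Sanger.
print(patch_region("ACGTNNNNACGT", "GGCC", 5, 8))
```

The coordinate check reproduces the 401 bp amplicon size stated above. A real tool would also need to align the Sanger consensus to the assembly and handle indels and conflicting base calls, which this sketch deliberately ignores.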
Subject(s)
Genome, Viral , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing , Mutation
ABSTRACT
The lack of an identifiable intermediate host species for the proximal animal ancestor of SARS-CoV-2, and the large geographical distance between Wuhan and where the closest evolutionarily related coronaviruses circulating in horseshoe bats (members of the Sarbecovirus subgenus) have been identified, is fueling speculation on the natural origins of SARS-CoV-2. We performed a comprehensive phylogenetic study on SARS-CoV-2 and all the related bat and pangolin sarbecoviruses sampled so far. Determining the likely recombination events reveals a highly reticulate evolutionary history within this group of coronaviruses. Distribution of the inferred recombination events is nonrandom with evidence that Spike, the main target for humoral immunity, is beside a recombination hotspot likely driving antigenic shift events in the ancestry of bat sarbecoviruses. Coupled with the geographic ranges of their hosts and the sampling locations, across southern China, and into Southeast Asia, we confirm that horseshoe bats, Rhinolophus, are the likely reservoir species for the SARS-CoV-2 progenitor. By tracing the recombinant sequence patterns, we conclude that there has been relatively recent geographic movement and cocirculation of these viruses' ancestors, extending across their bat host ranges in China and Southeast Asia over the last 100 years. We confirm that a direct proximal ancestor to SARS-CoV-2 has not yet been sampled, since the closest known relatives collected in Yunnan shared a common ancestor with SARS-CoV-2 approximately 40 years ago. Our analysis highlights the need for dramatically more wildlife sampling to: 1) pinpoint the exact origins of SARS-CoV-2's animal progenitor, 2) identify the intermediate species that facilitated transmission from bats to humans (if there is one), and 3) survey the extent of the diversity in the related sarbecoviruses' phylogeny that present high risk for future spillovers.
Subject(s)
Chiroptera/virology , Coronavirus/genetics , Pangolins/virology , Phylogeny , Recombination, Genetic , Animals , Humans , Phylogeography
ABSTRACT
The progression of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic in Africa has so far been heterogeneous, and the full impact is not yet well understood. In this study, we describe the genomic epidemiology using a dataset of 8746 genomes from 33 African countries and two overseas territories. We show that the epidemics in most countries were initiated by importations predominantly from Europe, which diminished after the early introduction of international travel restrictions. As the pandemic progressed, ongoing transmission in many countries and increasing mobility led to the emergence and spread within the continent of many variants of concern and interest, such as B.1.351, B.1.525, A.23.1, and C.1.1. Although distorted by low sampling numbers and blind spots, the findings highlight that Africa must not be left behind in the global pandemic response, otherwise it could become a source for new variants.
Subject(s)
COVID-19/epidemiology , Epidemiological Monitoring , Genomics , Pandemics , SARS-CoV-2/genetics , Africa/epidemiology , COVID-19/transmission , COVID-19/virology , Genetic Variation , Humans , SARS-CoV-2/isolation & purification
ABSTRACT
The independent emergence late in 2020 of the B.1.1.7, B.1.351, and P.1 lineages of SARS-CoV-2 prompted renewed concerns about the evolutionary capacity of this virus to overcome public health interventions and rising population immunity. Here, by examining patterns of synonymous and non-synonymous mutations that have accumulated in SARS-CoV-2 genomes since the pandemic began, we find that the emergence of these three "501Y lineages" coincided with a major global shift in the selective forces acting on various SARS-CoV-2 genes. Following their emergence, the adaptive evolution of 501Y lineage viruses has involved repeated selectively favored convergent mutations at 35 genome sites, mutations we refer to as the 501Y meta-signature. The ongoing convergence of viruses in many other lineages on this meta-signature suggests that it includes multiple mutation combinations capable of promoting the persistence of diverse SARS-CoV-2 lineages in the face of mounting host immune recognition.
Subject(s)
COVID-19/epidemiology , Evolution, Molecular , Mutation , Pandemics , SARS-CoV-2/genetics , Amino Acid Sequence/genetics , COVID-19/immunology , COVID-19/transmission , COVID-19/virology , Codon/genetics , Genes, Viral , Genetic Drift , Host Adaptation/genetics , Humans , Immune Evasion , Phylogeny , Public Health
ABSTRACT
The first severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in South Africa was identified on 5 March 2020, and by 26 March the country was in full lockdown (Oxford stringency index of 90; ref. 1). Despite the early response, by November 2020, over 785,000 people in South Africa were infected, which accounted for approximately 50% of all known African infections (ref. 2). In this study, we analyzed 1,365 near whole genomes and report the identification of 16 new lineages of SARS-CoV-2 isolated between 6 March and 26 August 2020. Most of these lineages have unique mutations that have not been identified elsewhere. We also show that three lineages (B.1.1.54, B.1.1.56 and C.1) spread widely in South Africa during the first wave, comprising ~42% of all infections in the country at the time. The newly identified C lineage of SARS-CoV-2, C.1, which has 16 nucleotide mutations as compared with the original Wuhan sequence, including one amino acid change on the spike protein, D614G (ref. 3), was the most geographically widespread lineage in South Africa by the end of August 2020. An early South African-specific lineage, B.1.106, which was identified in April 2020 (ref. 4), became extinct after nosocomial outbreaks were controlled in KwaZulu-Natal Province. Our findings show that genomic surveillance can be implemented on a large scale in Africa to identify new lineages and inform measures to control the spread of SARS-CoV-2. Such genomic surveillance presented in this study has been shown to be crucial in the identification of the 501Y.V2 variant in South Africa in December 2020 (ref. 5).
Subject(s)
COVID-19/epidemiology , COVID-19/virology , SARS-CoV-2/genetics , Datasets as Topic , Genome, Viral , Humans , Molecular Typing , Mutation , Pandemics , Phylogeny , Phylogeography , Real-Time Polymerase Chain Reaction , SARS-CoV-2/classification , SARS-CoV-2/isolation & purification , Sequence Analysis, RNA , South Africa/epidemiology , Whole Genome Sequencing
ABSTRACT
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes acute, highly transmissible respiratory infection in humans and a wide range of animal species. Its rapid global spread has resulted in a major public health emergency, necessitating commensurately rapid research to improve control strategies. In particular, the ability to effectively retrace transmission chains in outbreaks remains a major challenge, partly due to our limited understanding of the virus' underlying evolutionary dynamics within and between hosts. We used high-throughput sequencing whole-genome data coupled with bottleneck analysis to retrace the pathways of viral transmission in two nosocomial outbreaks that were previously characterised by epidemiological and phylogenetic methods. Additionally, we assessed the mutational landscape, selection pressures, and diversity at the within-host level for both outbreaks. Our findings show evidence of within-host selection and transmission of variants between samples. Both bottleneck and diversity analyses highlight within-host and consensus-level variants shared by putative source-recipient pairs in both outbreaks, suggesting that certain within-host variants in these outbreaks may have been transmitted upon infection rather than arising de novo independently within multiple hosts. Overall, our findings demonstrate the utility of combining within-host diversity and bottleneck estimations for elucidating transmission events in SARS-CoV-2 outbreaks, provide insight into the maintenance of viral genetic diversity, provide a list of candidate targets of positive selection for further investigation, and demonstrate that within-host variants can be transferred between patients. Together these results will help in developing strategies to understand the nature of transmission events and curtail the spread of SARS-CoV-2.