SARS-CoV-2 had infected almost 200 million people worldwide by July 2021, and the pandemic has been characterized by infection waves of viral lineages showing distinct fitness profiles. The simultaneous infection of a single individual by two distinct SARS-CoV-2 lineages provides a window of opportunity for viral recombination and the emergence of new lineages with differential phenotypes. Several hundred SARS-CoV-2 lineages are currently well characterized, but two main factors have precluded major coinfection/codetection analyses thus far: i) the low diversity of SARS-CoV-2 lineages during the first year of the pandemic, which limited the identification of the lineage-defining mutations necessary to distinguish coinfecting viral lineages; and ii) the limited availability of raw sequencing data in which the abundance and distribution of intrasample/intrahost variability can be accessed. Here, we assembled a large sequencing dataset from Brazilian samples covering the period from 18 May 2020 to 30 April 2021 and probed it for unexpected patterns of high intrasample/intrahost variability. This enabled us to detect nine cases of SARS-CoV-2 coinfection with well-characterized lineage-defining mutations. In addition, we matched these coinfections with spatio-temporal epidemiological data, confirming their plausibility given the lineages co-circulating in the timeframe investigated. These coinfections represent around 0.61% of all samples investigated. Although our data suggest that coinfection with distinct SARS-CoV-2 lineages is a rare phenomenon, this is likely an underestimate, and coinfection rates warrant further investigation.
DATA SUMMARY
The raw fastq data of the codetection cases are deposited on and correlated to GISAID codes: EPI_ISL_1068258, EPI_ISL_2491769, EPI_ISL_2491781, EPI_ISL_2645599, EPI_ISL_2661789, EPI_ISL_2661931, EPI_ISL_2677092, EPI_ISL_2777552, EPI_ISL_3869215.
Supplementary data are available on
The workflow code used in this study is publicly available on:

Mutations in both the receptor-binding domain (RBD) and the amino (N)-terminal domain (NTD) of the SARS-CoV-2 Spike (S) glycoprotein can alter its antigenicity and promote immune escape. We found that SARS-CoV-2 lineages circulating in Brazil that carry mutations of concern in the RBD independently acquired convergent deletions and insertions in the NTD of the S protein, which altered the NTD antigenic supersite and other predicted epitopes in this region. These findings support the view that the ongoing widespread transmission of SARS-CoV-2 in Brazil is generating new viral lineages that might be more resistant to neutralization than the parental variants of concern.

