ABSTRACT
There are outstanding evolutionary questions on the recent emergence of human coronavirus SARS-CoV-2 including the role of reservoir species, the role of recombination and its time of divergence from animal viruses. We find that the sarbecoviruses, the viral subgenus containing SARS-CoV and SARS-CoV-2, undergo frequent recombination and exhibit spatially structured genetic diversity on a regional scale in China. SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination. To employ phylogenetic dating methods, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were shown to be consistent for these three approaches and for two different prior specifications of evolutionary rates based on HCoV-OC43 and MERS-CoV. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% highest posterior density (HPD): 1879-1999), 1969 (95% HPD: 1930-2000) and 1982 (95% HPD: 1948-2009), indicating that the lineage giving rise to SARS-CoV-2 has been circulating unnoticed in bats for decades.
Subject(s)
Betacoronavirus/genetics , Coronavirus Infections/epidemiology , Coronavirus Infections/virology , Pneumonia, Viral/epidemiology , Pneumonia, Viral/virology , Angiotensin-Converting Enzyme 2 , Animals , Bayes Theorem , Betacoronavirus/metabolism , COVID-19 , China/epidemiology , Chiroptera/virology , Coronavirus Infections/metabolism , Evolution, Molecular , Genetic Variation , Genome, Viral , Humans , Pandemics , Peptidyl-Dipeptidase A/metabolism , Phylogeny , Pneumonia, Viral/metabolism , Recombination, Genetic , SARS-CoV-2
ABSTRACT
Due to the scope and impact of the COVID-19 pandemic, there is a strong desire to understand where the SARS-CoV-2 virus came from and how it jumped species boundaries to humans. Molecular evolutionary analyses can trace viral origins by establishing relatedness and divergence times of viruses and identifying past selective pressures. However, we must uphold rigorous standards of inference and interpretation on this topic because of the ramifications of being wrong. Here, we dispute the conclusions of Xia (2020. Extreme genomic CpG deficiency in SARS-CoV-2 and evasion of host antiviral defense. Mol Biol Evol. doi:10.1093/molbev/masa095) that dogs are a likely intermediate host of a SARS-CoV-2 ancestor. We highlight major flaws in Xia's inference process and his analysis of CpG deficiencies, and conclude that there is no direct evidence for the role of dogs as intermediate hosts. Bats and pangolins currently have the greatest support as ancestral hosts of SARS-CoV-2, with the strong caveat that sampling of wildlife species for coronaviruses has been limited.
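The abstract does not define how CpG deficiency is measured. A commonly used measure is the observed-to-expected CpG ratio, f(CG) / (f(C) × f(G)), where values well below 1 indicate CpG depletion. The sketch below is illustrative only: it assumes that ratio, the function name `cpg_index` is hypothetical, and it is not necessarily the exact computation used in either paper.

```python
from collections import Counter


def cpg_index(seq: str) -> float:
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G)).

    Illustrative assumption: dinucleotide frequency is taken over all
    overlapping positions, and nucleotide frequencies over the whole sequence.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")

    counts = Counter(seq)
    f_c = counts["C"] / n
    f_g = counts["G"] / n
    if f_c == 0 or f_g == 0:
        raise ValueError("ratio is undefined when C or G is absent")

    # Count CG dinucleotides across the n - 1 overlapping positions.
    cg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    f_cg = cg / (n - 1)

    return f_cg / (f_c * f_g)
```

A single low ratio for one genome says little on its own; comparisons of this kind depend on which viral genomes and host tissues are sampled, which is the inference step the abstract disputes.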