2.
Viruses ; 14(9)2022 09 14.
Article in English | MEDLINE | ID: covidwho-2143631

ABSTRACT

In this retrospective, single-center study, we analyzed 13,699 samples from different individuals obtained from the Federal Research Center of Fundamental and Translational Medicine from 1 April to 30 May 2020 in the Novosibirsk region (population 2.8 million people). Of all diagnostic tests, 6.49% were positive for SARS-CoV-2, and 42% of the positive cases were in asymptomatic people. We also detected two asymptomatic people who had no confirmed contact with patients with COVID-19. The highest percentage of positive samples was observed in the 80+ group (16.3%), while among children and adults it did not exceed 8%. Among all the people tested, 2423 came from a total of 80 different destinations, and only 27 of them were positive for SARS-CoV-2. Out of all the positive samples, 15 were taken for SARS-CoV-2 sequencing. According to the analysis of the genome sequences, the SARS-CoV-2 variants isolated in the Novosibirsk region at the beginning of the pandemic belonged to three phylogenetic lineages according to the Pangolin classification: B.1, B.1.1, and B.1.1.129. All Novosibirsk isolates contained the D614G substitution in the Spike protein; two isolates were characterized by an additional M153T mutation, and one isolate was characterized by the L5F mutation.
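The abstract above reports percentages rather than head counts. As a rough illustration only, the sketch below converts the reported figures back into approximate counts; the resulting numbers are derived here by rounding and are not stated in the abstract, and the helper name `approx_count` is hypothetical.

```python
def approx_count(total: int, percent: float) -> int:
    """Round a reported percentage of `total` to the nearest whole count."""
    return round(total * percent / 100)


# Figures taken from the abstract: 13,699 samples, 6.49% positive,
# and 42% of the positives asymptomatic.
total_samples = 13_699
positives = approx_count(total_samples, 6.49)
asymptomatic = approx_count(positives, 42)

print(f"approx. positive samples: {positives}")
print(f"approx. asymptomatic positives: {asymptomatic}")
```

Because the percentages are themselves rounded, these counts should be read as estimates, not as the study's data.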


Subject(s)
COVID-19 , SARS-CoV-2 , Adult , COVID-19/epidemiology , Child , Genome, Viral , Genomics , Humans , Mutation , Pandemics , Phylogeny , Retrospective Studies , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics
3.
Viruses ; 14(11)2022 Nov 21.
Article in English | MEDLINE | ID: covidwho-2123865

ABSTRACT

A considerable number of new SARS-CoV-2 lineages have emerged since the first COVID-19 cases were reported in Wuhan. As a few variants showed higher COVID-19 disease transmissibility and the ability to escape from immune responses, surveillance became relevant at that time. Single-nucleotide mutation PCR-based protocols were not always specific, and consequently, determination of a high number of informative sites was needed for accurate lineage identification. A detailed in silico analysis of SARS-CoV-2 sequences retrieved from the GISAID database revealed a 921 bp fragment of the S gene, positions 22784-23705 of the SARS-CoV-2 reference genome, as the most informative fragment (30 variable sites) to determine relevant SARS-CoV-2 variants. Consequently, a method consisting of PCR amplification of this fragment, followed by Sanger sequencing and a "single-click" informatic program based on a reference database, was developed and validated. PCR fragments obtained from clinical SARS-CoV-2 samples were compared with homologous variant sequences, and the resulting phylogenetic tree allowed the identification of Alpha, Delta, Omicron, Beta, Gamma, and other variants. The data analysis procedure was automated and simplified to the point that it did not require specific technical skills. The method is faster and cheaper than current whole-genome sequencing methods; it is available worldwide, and it may help to enhance efficient surveillance in the fight against the COVID-19 pandemic.
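The idea of cutting one fragment out of a genome and counting its informative sites can be sketched in a few lines. This is a minimal illustration, not the authors' program. The function names are hypothetical, the toy sequences are invented, and the coordinate convention is an assumption: the abstract gives positions 22784-23705 and a length of 921 bp, which matches a 1-based start with an exclusive end.

```python
def extract_fragment(genome: str, start: int, end: int) -> str:
    """Return the fragment from 1-based `start` (inclusive) to `end` (exclusive).

    Assumption: an exclusive end is used so that 22784-23705 yields 921 bases.
    """
    return genome[start - 1:end - 1]


def variable_sites(aligned: list[str]) -> list[int]:
    """Return 0-based columns where aligned sequences disagree.

    Ambiguous bases (e.g. N) and gaps are ignored when comparing a column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for index, column in enumerate(zip(*aligned)):
        bases = {base for base in column if base in "ACGT"}
        if len(bases) > 1:
            sites.append(index)
    return sites


# Toy alignment (invented data) standing in for sequenced PCR fragments.
toy_fragments = ["ACGTAC", "ACGTTC", "ANGTAC", "ACCTAC"]
toy_sites = variable_sites(toy_fragments)
print(toy_sites)

# Length check on a placeholder genome of the right order of magnitude.
placeholder_genome = "A" * 30_000
fragment_length = len(extract_fragment(placeholder_genome, 22784, 23705))
print(fragment_length)
```

In practice the fragments would first be aligned to the reference, and lineage assignment would compare the variable-site pattern against a curated database, as the abstract describes.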


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Phylogeny , Genome, Viral , COVID-19/diagnosis , COVID-19/epidemiology , Pandemics , Polymerase Chain Reaction
4.
Saudi Med J ; 43(11): 1276-1279, 2022 Nov.
Article in English | MEDLINE | ID: covidwho-2119223

ABSTRACT

OBJECTIVES: To investigate the emergent mutations involved in the evolutionary stages of the virus for better management of the pandemic. METHODS: This cross-sectional genomic investigation was performed on February 28, 2022, at the Biology Department, Faculty of Science, Tabuk University. Numerous mutations were searched for in genomic isolates of the Omicron variant prevalent in the Kingdom of Saudi Arabia. Whole-genome sequences were retrieved from genomic databases and were subjected to the Global Initiative on Sharing Avian Influenza Data (GISAID) CoVsurver for Omicron variant detection and mutation identification. RESULTS: Approximately 8.755 million SARS-CoV-2 genomes were reported to GISAID on February 28, 2022, of which 1270 have been reported from the Kingdom of Saudi Arabia. Among the 1270 genomes, 30 were Omicron variants. Twenty-four unique mutations have been detected in membrane, envelope, spike and non-structural proteins (NSP) 12, NSP3, and NSP2. Ten of these unique mutations have been detected in spike protein. CONCLUSION: The current study provides useful information for further experimental investigation of the effects of mutations on virus transmission, severity, and vaccine efficacy.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Animals , SARS-CoV-2/genetics , Saudi Arabia/epidemiology , Genome, Viral , Cross-Sectional Studies , COVID-19/epidemiology , Mutation , Genomics
5.
Sci Rep ; 12(1): 19416, 2022 Nov 12.
Article in English | MEDLINE | ID: covidwho-2119170

ABSTRACT

The current COVID-19 pandemic outbreak poses a serious threat to public health, demonstrating the critical need for the development of effective and reproducible detection tests. Since the RT-qPCR primers are highly specific and can only be designed based on the known sequence, mutation sensitivity is its limitation. Moreover, the mutations in the severe acute respiratory syndrome β-coronavirus (SARS-CoV-2) genome led to new highly transmissible variants such as Delta and Omicron variants. In the case of mutation, RT-qPCR primers cannot recognize and attach to the target sequence. This research presents an accurate dual-platform DNA biosensor based on the colorimetric assay of gold nanoparticles and the surface-enhanced Raman scattering (SERS) technique. It simultaneously targets four different regions of the viral genome for detection of SARS-CoV-2 and its new variants prior to any sequencing. Hence, in the case of mutation in one of the target sequences, the other three probes could detect the SARS-CoV-2 genome. The method is based on visible biosensor color shift and a locally enhanced electromagnetic field and significantly amplified SERS signal due to the proximity of Sulfo-Cyanine 3 (Cy3) and AuNPs intensity peak at 1468 cm⁻¹. The dual-platform DNA/GO/AuNP biosensor exhibits high sensitivity toward the viral genome with a LOD of 0.16 ng/µL. This is a safe point-of-care, naked-eye, equipment-free, and rapid (10 min) detection biosensor for diagnosing COVID-19 cases at home using a nasopharyngeal sample.


Subject(s)
Biosensing Techniques , COVID-19 , Metal Nanoparticles , Humans , SARS-CoV-2/genetics , Gold , Pandemics , COVID-19/diagnosis , Biosensing Techniques/methods , Genome, Viral/genetics , DNA , RNA, Viral/genetics
6.
Sci Rep ; 12(1): 19274, 2022 Nov 11.
Article in English | MEDLINE | ID: covidwho-2118834

ABSTRACT

Since the beginning of the SARS-CoV-2 coronavirus pandemic, genome sequencing has been essential to monitor viral mutations over time and by territory. This need for complete genetic information is further reinforced by the rapid spread of variants of concern. In this paper, we assess the ability of the hybridization technique, Capture-Seq, to detect the SARS-CoV-2 genome, either partially or in its entirety, in patient samples. We studied 20 patient nasal swab samples broken down into five series of four samples of equivalent viral load from CT25 to CT36+. For this, we tested 3 multi-virus panels as well as 2 SARS-CoV-2-only panels. The panels were chosen based on their specificity, global or specific, as well as their technological difference in the composition of the probes: ssRNA, ssDNA and dsDNA. The multi-virus panels are able to capture high-abundance targets but fail to capture the lowest-abundance targets, with a high percentage of off-target reads corresponding to the abundance of the host sequences. Both SARS-CoV-2-only panels were very effective, with a high percentage of reads corresponding to the target. Overall, capture followed by sequencing is very effective for the study of SARS-CoV-2 in low-abundance patient samples and is suitable for samples with CT values up to 35.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/diagnosis , Pandemics , COVID-19 Testing , Base Sequence , Genome, Viral
7.
Probl Sotsialnoi Gig Zdravookhranenniiai Istor Med ; 30(s1): 1061-1066, 2022 Dec 15.
Article in Russian | MEDLINE | ID: covidwho-2117183

ABSTRACT

An important goal of COVID-19 surveillance is to detect outbreaks using modern molecular epidemiology techniques based on methods to decode the full genome of the virus, since rapidly evolving RNA viruses, which include SARS-CoV-2, are constantly accumulating changes in their genomes. In addition to using these changes to identify the different virus lineages spreading in the population, the availability of sequence information is very important. It will allow the identification of altered variants that may be more transmissible, cause more severe forms of disease, or be undetectable by existing diagnostic test systems. The global scientific community is particularly interested in changes in the spike protein (S-protein, Spike) because they are responsible for binding and penetration into the host cell, lead to false-negative results in diagnostic tests, and affect transmission rates, health outcomes, therapeutic interventions, and vaccine efficacy. Genomic surveillance uses next-generation sequencing (NGS) applications and makes data on the full genome of the virus available. These methods offer new means to detect variants that differ phenotypically or antigenically. This approach promotes earlier prediction as well as effective strategies to mitigate and contain outbreaks of SARS-CoV-2 and other new viruses long before they spread worldwide. Today, molecular typing of strains is playing an increasingly important role in this process, as it makes it possible to identify samples that share a common molecular «fingerprint».


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Genome, Viral , Phylogeny , Moscow/epidemiology , COVID-19/diagnosis , COVID-19/epidemiology , Genomics
8.
Nat Commun ; 13(1): 7003, 2022 Nov 16.
Article in English | MEDLINE | ID: covidwho-2116500

ABSTRACT

Genomic sequencing is essential to track the evolution and spread of SARS-CoV-2, optimize molecular tests, treatments, vaccines, and guide public health responses. To investigate the global SARS-CoV-2 genomic surveillance, we used sequences shared via GISAID to estimate the impact of sequencing intensity and turnaround times on variant detection in 189 countries. In the first two years of the pandemic, 78% of high-income countries sequenced >0.5% of their COVID-19 cases, while 42% of low- and middle-income countries reached that mark. Around 25% of the genomes from high-income countries were submitted within 21 days, a pattern observed in 5% of the genomes from low- and middle-income countries. We found that sequencing around 0.5% of the cases, with a turnaround time <21 days, could provide a benchmark for SARS-CoV-2 genomic surveillance. Socioeconomic inequalities undermine global pandemic preparedness, and efforts must be made to help low- and middle-income countries improve their local sequencing capacity.
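The benchmark proposed in this abstract (about 0.5% of cases sequenced, turnaround under 21 days) can be expressed as a simple check. The sketch below is illustrative only: the class, field names and example numbers are hypothetical. Treating "around 0.5%" as "at least 0.5%" and summarising turnaround by a median are assumptions made here, not choices stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class SurveillanceStats:
    """Hypothetical per-country summary; field names are illustrative."""
    cases: int
    genomes: int
    median_turnaround_days: float


def meets_benchmark(stats: SurveillanceStats,
                    min_fraction: float = 0.005,
                    max_days: float = 21.0) -> bool:
    """Check sequencing fraction >= 0.5% and turnaround < 21 days (assumed reading)."""
    if stats.cases <= 0:
        return False
    fraction = stats.genomes / stats.cases
    return fraction >= min_fraction and stats.median_turnaround_days < max_days


# Invented example figures, not data from the study.
country_a = SurveillanceStats(cases=1_000_000, genomes=8_000, median_turnaround_days=14)
country_b = SurveillanceStats(cases=1_000_000, genomes=2_000, median_turnaround_days=45)

print(meets_benchmark(country_a))
print(meets_benchmark(country_b))
```

Keeping the two thresholds as parameters makes it easy to test how sensitive a classification is to the exact cut-offs.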


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Genome, Viral/genetics , COVID-19/epidemiology , Pandemics , Genomics
9.
Viruses ; 14(11)2022 Nov 19.
Article in English | MEDLINE | ID: covidwho-2116191

ABSTRACT

Infectious Bronchitis (IB) is a respiratory disease caused by a highly variable Gammacoronavirus, which generates a negative impact on poultry health worldwide. GI-11 and GI-16 lineages have been identified in South America based on Infectious Bronchitis virus (IBV) partial S1 sequences. However, full genome sequence information is limited. In this study we report, for the first time, the whole-genome sequence of IBV from Colombia. Seven IBV isolates obtained during 2012 and 2013 from farms with respiratory disease compatible with IB were selected and the complete genome sequence was obtained by NGS. According to S1 sequence phylogenetic analysis, six isolates belong to lineage GI-1 and one to lineage GVI-1. When the whole genome was analyzed, five isolates were related to the vaccine strain Ma5 2016 and two showed mosaic genomes. Results from complete S1 sequence analysis provide further support for the hypothesis that GVI-1, considered a geographically confined lineage in Asia, could have originated in Colombia. The complete genome information reported in this research allows a deeper understanding of the phylogenetic evolution of variants and the recombination events between strains that are circulating worldwide, contributing to the knowledge of coronavirus in Latin America and the world.


Subject(s)
Infectious bronchitis virus , Poultry Diseases , Animals , Phylogeny , Colombia/epidemiology , Poultry Diseases/prevention & control , Chickens , Genome, Viral
10.
Genes (Basel) ; 13(11)2022 Nov 18.
Article in English | MEDLINE | ID: covidwho-2115985

ABSTRACT

The COVID-19 pandemic initiated a race to determine the best measures to control the disease and to save as many people as possible. Efforts to implement social distancing, the use of masks, and massive vaccination programs turned out to be essential in reducing the devastating effects of the pandemic. Nevertheless, the high mutation rates of SARS-CoV-2 challenge the vaccination strategy and maintain the threat of new outbreaks due to the risk of infection surges and even lethal variations able to resist the effects of vaccines and upset the balance. Most of the new therapies tested against SARS-CoV-2 came from already available formulations developed to treat other diseases, so they were not specifically developed for SARS-CoV-2. In parallel, the knowledge produced regarding the molecular mechanisms involved in this disease was vast due to massive efforts worldwide. Taking advantage of such a vast molecular understanding of virus genomes and disease mechanisms, a targeted molecular therapy based on siRNA specifically developed to reach exclusive SARS-CoV-2 genomic sequences was tested in a non-transformed human cell model. Since coronavirus can escape from siRNA by producing siRNA inhibitors, a complex strategy to simultaneously strike both the viral infectious mechanism and the capability of evading siRNA therapy was developed. The combined administration of the chosen produced siRNA proved to be highly effective in successfully reducing viral load and keeping virus replication under control, even after many days of treatment, unlike the combinations of siRNAs lacking this anti-anti-siRNA capability. Additionally, the developed therapy did not harm the normal cells, which was demonstrated because, instead of testing the siRNA in nonhuman cells or in transformed human cells, a non-transformed human thyroid cell was specifically chosen for the experiment. 
The proposed siRNA combination could reduce the viral load and allow cellular recovery, presenting a potential innovation for consideration as an additional strategy to counter or cope with COVID-19.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , Virus Replication/genetics , Genome, Viral , RNA, Small Interfering/genetics
11.
Proc Biol Sci ; 289(1987): 20221747, 2022 Nov 30.
Article in English | MEDLINE | ID: covidwho-2115857

ABSTRACT

The raw material for viral evolution is provided by intra-host mutations occurring during replication, transcription or post-transcription. Replication and transcription of Coronaviridae proceed through the synthesis of negative-sense 'antigenomes' acting as templates for positive-sense genomic and subgenomic RNA. Hence, mutations in the genomes of SARS-CoV-2 and other coronaviruses can occur during (and after) the synthesis of either negative-sense or positive-sense RNA, with potentially distinct patterns and consequences. We explored for the first time the mutational spectrum of SARS-CoV-2 (sub)genomic and anti(sub)genomic RNA. We use a high-quality deep sequencing dataset produced using a quantitative strand-aware sequencing method, controlled for artefacts and sequencing errors, and scrutinized for accurate detection of within-host diversity. The nucleotide differences between negative- and positive-sense strand consensus vary between patients and do not show dependence on age or sex. Similarities and differences in mutational patterns between within-host minor variants on the two RNA strands suggested strand-specific mutations or editing by host deaminases and oxidative damage. We observe generally neutral and slight negative selection on the negative strand, contrasting with purifying selection in ORF1a, ORF1b and S genes of the positive strand of the genome.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , RNA, Viral/genetics , Genome, Viral , Mutation , Genomics
12.
Microbiol Spectr ; 10(2): e0224021, 2022 04 27.
Article in English | MEDLINE | ID: covidwho-2115551

ABSTRACT

During the coronavirus disease 2019 (COVID-19) pandemic, the emergence and rapid increase of the B.1.1.7 (Alpha) lineage of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), first identified in the United Kingdom in September 2020, was well documented in different areas of the world and became a global public health concern because of its increased transmissibility. The B.1.1.7 lineage was first detected in Mexico during December 2020, showing a slow progressive increase in its circulation frequency, which reached its maximum in May 2021 but never became predominant. In this work, we analyzed the patterns of diversity and distribution of this lineage in Mexico using phylogenetic and haplotype network analyses. Despite the reported increase in transmissibility of the B.1.1.7 lineage, in most Mexican states, it did not displace cocirculating lineages, such as B.1.1.519, which dominated the country from February to May 2021. Our results show that the states with the highest prevalence of B.1.1.7 were those at the Mexico-U.S. border. An apparent pattern of dispersion of this lineage from the northern states of Mexico toward the center or the southeast was observed in the largest transmission chains, indicating possible independent introduction events from the United States. However, other entry points cannot be excluded, as shown by multiple introduction events. Local transmission led to a few successful haplotypes with a localized distribution and specific mutations indicating sustained community transmission. IMPORTANCE The emergence and rapid increase of the B.1.1.7 (Alpha) lineage of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) throughout the world were due to its increased transmissibility. However, it did not displace cocirculating lineages in most of Mexico, particularly B.1.1.519, which dominated the country from February to May 2021. 
In this work, we analyzed the distribution of B.1.1.7 in Mexico using phylogenetic and haplotype network analyses. Our results show that the states with the highest prevalence of B.1.1.7 (around 30%) were those at the Mexico-U.S. border, which also exhibited the highest lineage diversity, indicating possible introduction events from the United States. Also, several haplotypes were identified with a localized distribution and specific mutations, indicating that sustained community transmission occurred in the country.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Genome, Viral , Humans , Mexico/epidemiology , Phylogeny , SARS-CoV-2/genetics
13.
Elife ; 11, 2022 11 08.
Article in English | MEDLINE | ID: covidwho-2110897

ABSTRACT

Public health emergencies like SARS, MERS, and COVID-19 have prioritized surveillance of zoonotic coronaviruses, resulting in extensive genomic characterization of coronavirus diversity in bats. Sequencing viral genomes directly from animal specimens remains a laboratory challenge, however, and most bat coronaviruses have been characterized solely by PCR amplification of small regions from the best-conserved gene. This has resulted in limited phylogenetic resolution and left viral genetic factors relevant to threat assessment undescribed. In this study, we evaluated whether a technique called hybridization probe capture can achieve more extensive genome recovery from surveillance specimens. Using a custom panel of 20,000 probes, we captured and sequenced coronavirus genomic material in 21 swab specimens collected from bats in the Democratic Republic of the Congo. For 15 of these specimens, probe capture recovered more genome sequence than had been previously generated with standard amplicon sequencing protocols, providing a median 6.1-fold improvement (ranging up to 69.1-fold). Probe capture data also identified five novel alpha- and betacoronaviruses in these specimens, and their full genomes were recovered with additional deep sequencing. Based on these experiences, we discuss how probe capture could be effectively operationalized alongside other sequencing technologies for high-throughput, genomics-based discovery and surveillance of bat coronaviruses.


Subject(s)
COVID-19 , Chiroptera , Animals , Phylogeny , Genetic Variation , Sequence Analysis, DNA , Genome, Viral/genetics , High-Throughput Nucleotide Sequencing , Genomics
14.
Viruses ; 14(11)2022 Nov 11.
Article in English | MEDLINE | ID: covidwho-2110276

ABSTRACT

SARS-CoV-2 virus pathogenicity and transmissibility are correlated with the mutations acquired over time, giving rise to variants of concern (VOCs). Mutations can significantly influence the genetic make-up of the virus. Herein, we analyzed the SARS-CoV-2 genomes and sub-genomic nucleotide composition in relation to the mutation rate. Nucleotide percentage distributions of 1397 in-house-sequenced SARS-CoV-2 genomes were enumerated, and comparative analyses (i) within the VOCs and of (ii) recovered and mortality patients were performed. Fisher's test was carried out to highlight the significant mutations, followed by RNA secondary structure prediction and protein modeling for their functional impacts. Subsequently, a uniform dinucleotide composition of AT and GC was found across study cohorts. Notably, the N gene was observed to have a high GC percentage coupled with a relatively higher mutation rate. Functional analysis demonstrated the N gene mutations, C29144T and G29332T, to induce structural changes at the RNA level. Protein secondary structure prediction with N gene missense mutations revealed a differential composition of alpha helices, beta sheets, and coils, whereas the tertiary structure displayed no significant changes. Additionally, the N gene CTD region displayed no mutations. The analysis highlighted the importance of N protein in viral evolution with CTD as a possible target for antiviral drugs.
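Nucleotide percentage distributions and GC content, as discussed in this abstract, are straightforward to compute. The following is a minimal sketch and not the authors' pipeline. The function names are hypothetical and the toy sequence is invented. It assumes sequences are stored in the DNA alphabet (A, C, G, T) and that ambiguous bases such as N are excluded from the denominator.

```python
from collections import Counter


def nucleotide_percentages(sequence: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: 100 * counts[base] / total for base in "ACGT"}


def gc_percent(sequence: str) -> float:
    """Return the combined G + C percentage."""
    percentages = nucleotide_percentages(sequence)
    return percentages["G"] + percentages["C"]


# Toy sequence (invented); the two N characters are ignored.
toy_gene = "ATGCGCNNAT"
print(nucleotide_percentages(toy_gene))
print(gc_percent(toy_gene))
```

Applying `gc_percent` per gene, rather than per genome, is what would allow a comparison such as the N gene observation reported in the abstract.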


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Mutation Rate , Nucleotides , Genome, Viral , RNA
15.
Proc Biol Sci ; 289(1986): 20221437, 2022 11 09.
Article in English | MEDLINE | ID: covidwho-2107716

ABSTRACT

The repeated emergence of SARS-CoV-2 escape mutants from host immunity has obstructed the containment of the current pandemic and poses a serious threat to humanity. Prolonged infection in immunocompromised patients has received increasing attention as a driver of immune escape, and accumulating evidence suggests that viral genomic diversity and emergence of immune-escape mutants are promoted in immunocompromised patients. However, because immunocompromised patients comprise a small proportion of the host population, whether they have a significant impact on antigenic evolution at the population level is unknown. We consider an evolutionary epidemiological model that combines antigenic evolution and epidemiological dynamics. Applying this model to a heterogeneous host population, we study the impact of immunocompromised hosts on the evolutionary dynamics of pathogen antigenic escape from host immunity. We derived analytical formulae of the speed of antigenic evolution in heterogeneous host populations and found that even a small number of immunocompromised hosts in the population significantly accelerates antigenic evolution. Our results demonstrate that immunocompromised hosts play a key role in viral adaptation at the population level and emphasize the importance of critical care and surveillance of immunocompromised hosts.


Subject(s)
Antigenic Drift and Shift , COVID-19 , Humans , SARS-CoV-2 , Genome, Viral , Immunocompromised Host
16.
Viruses ; 14(11)2022 Nov 02.
Article in English | MEDLINE | ID: covidwho-2099857

ABSTRACT

To explore a genomic pool of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) during the pandemic, the Ministry of Health of the Slovak Republic formed a genomics surveillance workgroup, and the Public Health Authority of the Slovak Republic launched a systematic national epidemiological surveillance using whole-genome sequencing (WGS). Six out of seven genomic centers implementing Illumina sequencing technology were involved in the national SARS-CoV-2 virus sequencing program. Here we analyze a total of 33,024 SARS-CoV-2 isolates collected from the Slovak population from 1 March 2021, to 31 March 2022, that were sequenced and analyzed in a consistent manner. Overall, 28,005 out of 30,793 successfully sequenced samples met the criteria to be deposited in the global GISAID database. During this period, we identified four variants of concern (VOC)-Alpha (B.1.1.7), Beta (B.1.351), Delta (B.1.617.2) and Omicron (B.1.1.529). In detail, we observed 165 lineages in our dataset, with dominating Alpha, Delta and Omicron in three major consecutive incidence waves. This study aims to describe the results of a routine but high-level SARS-CoV-2 genomic surveillance program. Our study of SARS-CoV-2 genomes in collaboration with the Public Health Authority of the Slovak Republic also helped to inform the public about the epidemiological situation during the pandemic.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Slovakia/epidemiology , COVID-19/epidemiology , Genome, Viral , High-Throughput Nucleotide Sequencing , Genomics
17.
BMC Bioinformatics ; 22(Suppl 15): 625, 2022 Apr 19.
Article in English | MEDLINE | ID: covidwho-1798449

ABSTRACT

BACKGROUND: Being able to efficiently call variants from the increasing amount of sequencing data produced daily from multiple viral strains is of the utmost importance, as demonstrated during the COVID-19 pandemic, in order to track the spread of the viral strains across the globe. RESULTS: We present MALVIRUS, an easy-to-install and easy-to-use application that assists users in multiple tasks required for the analysis of a viral population, such as SARS-CoV-2. MALVIRUS allows users to: (1) construct a variant catalog consisting of a set of variations (SNPs/indels) from the population sequences, (2) efficiently genotype and annotate variants of the catalog supported by a read sample, and (3) when the considered viral species is SARS-CoV-2, assign the input sample to the most likely Pango lineages using the genotyped variations. CONCLUSIONS: Tests on Illumina and Nanopore samples proved the efficiency and the effectiveness of MALVIRUS in analyzing SARS-CoV-2 strain samples with respect to publicly available data provided by NCBI and the more complete dataset provided by GISAID. A comparison with state-of-the-art tools showed that MALVIRUS is always more precise and often has better recall.


Subject(s)
COVID-19 , Genome, Viral , High-Throughput Nucleotide Sequencing , Humans , Mutation , Pandemics , Phylogeny , SARS-CoV-2/genetics
18.
PLoS One ; 17(11): e0275623, 2022.
Article in English | MEDLINE | ID: covidwho-2098746

ABSTRACT

An important unmet need revealed by the COVID-19 pandemic is the near-real-time identification of potentially fitness-altering mutations within rapidly growing SARS-CoV-2 lineages. Although powerful molecular sequence analysis methods are available to detect and characterize patterns of natural selection within modestly sized gene-sequence datasets, the computational complexity of these methods and their sensitivity to sequencing errors render them effectively inapplicable in large-scale genomic surveillance contexts. Motivated by the need to analyze new lineage evolution in near-real time using large numbers of genomes, we developed the Rapid Assessment of Selection within CLades (RASCL) pipeline. RASCL applies state of the art phylogenetic comparative methods to evaluate selective processes acting at individual codon sites and across whole genes. RASCL is scalable and produces automatically updated regular lineage-specific selection analysis reports: even for lineages that include tens or hundreds of thousands of sampled genome sequences. Key to this performance is (i) generation of automatically subsampled high quality datasets of gene/ORF sequences drawn from a selected "query" viral lineage; (ii) contextualization of these query sequences in codon alignments that include high-quality "background" sequences representative of global SARS-CoV-2 diversity; and (iii) the extensive parallelization of a suite of computationally intensive selection analysis tests. Within hours of being deployed to analyze a novel rapidly growing lineage of interest, RASCL will begin yielding JavaScript Object Notation (JSON)-formatted reports that can be either imported into third-party analysis software or explored in standard web-browsers using the premade RASCL interactive data visualization dashboard. 
By enabling the rapid detection of genome sites evolving under different selective regimes, RASCL is well-suited for near-real-time monitoring of the population-level selective processes that will likely underlie the emergence of future variants of concern in measurably evolving pathogens with extensive genomic surveillance.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , COVID-19/genetics , Phylogeny , Codon/genetics , Sequence Analysis , Genome, Viral
19.
BMC Ecol Evol ; 22(1): 123, 2022 10 28.
Article in English | MEDLINE | ID: covidwho-2098309

ABSTRACT

The genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) contains many insertions/deletions (indels) relative to the genomes of other SARS-related coronaviruses. Some of the identified indels have recently been reported to involve relatively long segments of 10-300 consecutive bases, with diverse RNA sequences around gaps between virus species, both of which are different characteristics from the classical shorter in-frame indels. These non-classical complex indels have been identified in non-structural protein 3 (Nsp3), the S1 domain of the spike (S), and open reading frame 8 (ORF8). To determine whether the occurrence of these non-classical indels in specific genomic regions is ubiquitous among broad species of SARS-related coronaviruses in different animal hosts, the present study compared SARS-related coronaviruses from humans (SARS-CoV and SARS-CoV-2), bats (RaTG13 and Rc-o319), and pangolins (GX-P4L), by performing multiple sequence alignment. As a result, indel hotspots with diverse RNA sequences of different lengths between the viruses were confirmed in the Nsp2 gene (approximately 2500-2600 base positions in the overall 29,900 bases), Nsp3 gene (approximately 3000-3300 and 3800-3900 base positions), N-terminal domain of the spike protein (21,500-22,500 base positions), and ORF8 gene (27,800-28,200 base positions). The abnormally high rate of point mutations and complex indels in these regions suggests that the occurrence of mutations in these hotspots may be selectively neutral or even benefit the survival of the viruses. The presence of such indel hotspots has not been reported in different human SARS-CoV-2 strains in the last 2 years, suggesting a lower rate of indels in human SARS-CoV-2. Future studies to elucidate the mechanisms enabling the frequent development of long and complex indels in specific genomic regions of SARS-related coronaviruses would offer deeper insights into the process of viral evolution.
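Long indels of the kind described here show up in a multiple sequence alignment as runs of gap characters. The sketch below shows one simple way to locate such runs in a single aligned sequence; it is not the study's method. The function name and toy sequence are hypothetical. The default minimum length of 10 follows the 10-300 base range mentioned in the abstract, and positions are alignment columns rather than genome coordinates.

```python
def gap_runs(aligned_sequence: str, min_length: int = 10) -> list[tuple[int, int]]:
    """Return (start, end) column pairs of gap runs at least `min_length` long.

    Columns are 0-based and `end` is exclusive.
    """
    runs = []
    run_start = None
    for index, character in enumerate(aligned_sequence):
        if character == "-":
            if run_start is None:
                run_start = index
        elif run_start is not None:
            if index - run_start >= min_length:
                runs.append((run_start, index))
            run_start = None
    # Handle a gap run that extends to the end of the sequence.
    if run_start is not None and len(aligned_sequence) - run_start >= min_length:
        runs.append((run_start, len(aligned_sequence)))
    return runs


# Toy aligned sequence (invented): one 12-column gap and one 2-column gap.
toy_aligned = "ACGT" + "-" * 12 + "ACG" + "--" + "T"
long_gaps = gap_runs(toy_aligned)
print(long_gaps)
```

Running this over every sequence in an alignment and tallying where the runs cluster would give a rough picture of candidate hotspots.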


Subject(s)
COVID-19 , Chiroptera , SARS Virus , Animals , Humans , Open Reading Frames/genetics , SARS-CoV-2/genetics , Genome, Viral/genetics , SARS Virus/genetics , Evolution, Molecular , Phylogeny , COVID-19/genetics , Chiroptera/genetics , Pangolins
20.
Viruses ; 14(11)2022 Oct 27.
Article in English | MEDLINE | ID: covidwho-2090361

ABSTRACT

Due to the emergence of new variants of the SARS-CoV-2 coronavirus, the question of how the viral genomes evolved, leading to the formation of highly infectious strains, becomes particularly important. Three major emergent strains, Alpha, Beta and Delta, characterized by a significant number of missense mutations, provide a natural test field. We accumulated and aligned 4.7 million SARS-CoV-2 genomes from the GISAID database and carried out a comprehensive set of analyses. This collection covers the period until the end of October 2021, i.e., the beginnings of the Omicron variant. First, we explored the combinatorial complexity of the emerging genomic variants and their timing, which indicates very strong, albeit hidden, selection forces. Our analyses show that the mutations that define variants of concern did not arise gradually but rather co-evolved rapidly, leading to the emergence of the full variant strain. To explore in more detail the evolutionary forces at work, we developed time trajectories of mutations at all 29,903 sites of the SARS-CoV-2 genome, week by week, and stratified them into trends related to (i) point substitutions, (ii) deletions and (iii) non-sequenceable regions. We focused on classifying the genetic forces active at different ranges of the mutational spectrum. We observed the agreement of the lowest-frequency mutation spectrum with the Griffiths-Tavaré theory, under the Infinite Sites Model and neutrality. If we widen the frequency range, the observed site frequency spectra are much more consistent with the Tung-Durrett model assuming clone competition and selection. The coefficients of the fitting model indicate the possibility of selection acting to promote gradual growth slowdown, as observed in the history of the variants of concern. These results add up to a model of genomic evolution, which partly fits into the classical drift barrier ideas. 
Certain observations, such as mutation "bands" persistent over the epidemic history, suggest contribution of genetic forces different from mutation, drift and selection, including recombination or other genome transformations. In addition, we show that a "toy" mathematical model can qualitatively reproduce how new variants (clones) stem from rare advantageous driver mutations, and then acquire neutral or disadvantageous passenger mutations which gradually reduce their fitness so they can be then outcompeted by new variants due to other driver mutations.
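The site frequency spectrum referred to in this abstract counts, for each k, how many sites carry a mutation in exactly k of the sampled genomes. The sketch below computes an observed spectrum from a small presence/absence matrix. It is an illustration only and does not reproduce the authors' fitting to the Griffiths-Tavaré or Tung-Durrett models. The function name and toy data are hypothetical, and excluding sites where the mutation is absent or fixed is a convention assumed here.

```python
from collections import Counter


def site_frequency_spectrum(genotypes: list[list[int]]) -> list[int]:
    """Return counts of sites whose mutation occurs in k genomes, k = 1..n-1.

    `genotypes` holds one row per genome; 1 marks a mutation at that site.
    Sites with no mutation (k = 0) or a fixed mutation (k = n) are skipped.
    """
    genome_count = len(genotypes)
    counts = Counter()
    for column in zip(*genotypes):
        carriers = sum(column)
        if 0 < carriers < genome_count:
            counts[carriers] += 1
    return [counts[k] for k in range(1, genome_count)]


# Toy matrix (invented): 4 genomes, 5 sites.
toy_genotypes = [
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 1],
]
toy_spectrum = site_frequency_spectrum(toy_genotypes)
print(toy_spectrum)
```

Here two sites are singletons, one site is shared by two genomes, and the fixed site is excluded, so the spectrum has three entries for k = 1, 2, 3.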


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , COVID-19/epidemiology , Genome, Viral , Genomics , Mutation , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus , Evolution, Molecular