Results 1 - 20 of 26
1.
Sci Rep ; 12(1): 8725, 2022 May 30.
Article in English | MEDLINE | ID: covidwho-1868017

ABSTRACT

Genome variant calling is a challenging yet critical task for downstream studies. Most existing methods rely on high-depth DNA sequencing data, and their performance drops substantially on low-depth data. Using public human Oxford Nanopore Technologies (ONT) data from the Genome in a Bottle (GIAB) Consortium, we trained a generative adversarial network for low-depth variant calling. Our method, denoted LDV-Caller, can project high-depth sequencing information from low-depth data. It achieves a 94.25% F1 score on low-depth data, while the F1 score of the state-of-the-art method on data with two times higher depth is 94.49%. As a result, the cost of genome-wide sequencing can be reduced substantially. In addition, we validated the trained LDV-Caller model on 157 public severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) samples. The mean sequencing depth of these samples is 2982x. LDV-Caller yields a 92.77% F1 score using only 22x sequencing depth, which demonstrates that our method has the potential to analyze different species with only low-depth sequencing data.
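As a worked illustration of the metric quoted above, the following minimal sketch computes an F1 score from true-positive, false-positive, and false-negative variant calls, and the fraction of reads one would keep to bring a 2982x sample down to roughly 22x. The function names are hypothetical, and uniform random read subsampling is an assumption; the abstract does not say how the low-depth data were produced.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall for a set of variant calls."""
    if true_positives == 0:
        return 0.0  # no correct calls: precision and recall are both zero
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


def downsample_fraction(target_depth: float, mean_depth: float) -> float:
    """Fraction of reads to keep so the expected depth falls to target_depth.

    Assumes reads are subsampled uniformly at random (an assumption,
    not a detail taken from the abstract).
    """
    if mean_depth <= 0:
        raise ValueError("mean_depth must be positive")
    return target_depth / mean_depth


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the formula.
    print(f1_score(true_positives=90, false_positives=10, false_negatives=10))
    # Depths taken from the abstract: 22x target, 2982x mean.
    print(downsample_fraction(22, 2982))
```

Under that assumption, fewer than 1% of the reads in the SARS-CoV-2 samples would be retained at 22x.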


Subject(s)
COVID-19 , Polymorphism, Single Nucleotide , COVID-19/genetics , Genome, Human , Humans , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods
2.
Zhejiang Da Xue Xue Bao Yi Xue Ban ; 50(6): 748-754, 2021 Dec 25.
Article in English | MEDLINE | ID: covidwho-1753705

ABSTRACT

This study explored the value of nanopore sequencing in the diagnosis and treatment of secondary infections in patients with severe coronavirus disease 2019 (COVID-19). A total of 77 clinical specimens from 3 patients with severe COVID-19 were collected. After heat inactivation, all samples were subjected to total nucleic acid extraction based on magnetic bead enrichment. The extracted DNA was used for DNA library construction, and nanopore real-time sequencing was then performed. Potential pathogens were identified from the sequencing data by species matching against the Centrifuge software database and differential analysis in R. Nanopore sequencing results were compared with the results of a respiratory pathogen qPCR screening panel and conventional microbiological testing to verify the effectiveness of nanopore sequencing. Nanopore sequencing detected pathogens in 44 specimens (57.1%). The potential pathogens identified by nanopore sequencing included , , and , et al. , , were also detected in clinical microbiological culture-based detection; was detected in respiratory pathogen screening qPCR panel; was only detected by the nanopore sequencing technique. Taking the clinical symptoms into account, the patient was treated with antibiotics against , and the infection was controlled. Nanopore sequencing may assist the diagnosis and treatment of severe COVID-19 patients through rapid identification of potential pathogens.


Subject(s)
COVID-19 , Coinfection , Nanopore Sequencing , Nanopores , COVID-19/diagnosis , Humans , Sequence Analysis, DNA/methods
3.
Microb Genom ; 8(3)2022 03.
Article in English | MEDLINE | ID: covidwho-1746154

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is adaptively evolving to ensure its persistence within human hosts. It is therefore necessary to continuously monitor the emergence and prevalence of novel variants. Importantly, some mutations have been associated with both molecular diagnostic failures and reduced or abrogated next-generation sequencing (NGS) read coverage in some genomic regions. Such impacts are particularly problematic when they occur in genomic regions such as those that encode the spike (S) protein, which are crucial for identifying and tracking the prevalence and dissemination dynamics of concerning viral variants. Targeted Sanger sequencing presents a fast and cost-effective means to accurately extend the coverage of whole-genome sequences. We designed a custom set of primers to amplify a 401 bp segment of the receptor-binding domain (RBD) (between positions 22698 and 23098 relative to the Wuhan-Hu-1 reference). We then designed a Sanger sequencing wet-laboratory protocol. We applied the primer set and wet-laboratory protocol to sequence 222 samples that were missing positions with the key mutations K417N, E484K, and N501Y due to poor coverage after NGS. Finally, we developed SeqPatcher, a Python-based computational tool that analyses the trace files yielded by Sanger sequencing to generate consensus sequences, or takes preanalysed consensus sequences in FASTA format, and merges them with their corresponding whole-genome assemblies. We successfully sequenced 153 of 222 samples (69 %) using Sanger sequencing and confirmed the occurrence of key Beta variant mutations (K417N, E484K, N501Y) in the S genes of 142 of 153 (93 %) samples. Additionally, one sample had the Y508F mutation and four samples had the S477N mutation. Samples with RT-PCR Ct values ranging from 13.85 to 37.47 (mean=25.70) could be Sanger sequenced efficiently. These results show that our method and pipeline can be used to improve the quality of whole-genome assemblies produced using NGS and can be used with any pairing of the most widely used NGS and Sanger sequencing platforms.
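The merging step described above can be illustrated with a minimal sketch. It is not SeqPatcher's implementation: the function name `patch_assembly` is hypothetical, and the sketch assumes the Sanger consensus is already in reference coordinates, has exactly the length of the target interval, and contains no insertions or deletions. Only ambiguous bases (N) in the assembly are filled, so existing NGS calls are left untouched.

```python
def patch_assembly(assembly: str, patch: str, start: int, end: int) -> str:
    """Fill N bases in assembly[start..end] (1-based, inclusive) from patch.

    Hypothetical helper for illustration; assumes `patch` is already
    aligned to the same coordinates and contains no indels.
    """
    if len(patch) != end - start + 1:
        raise ValueError("patch length must match the target interval")
    region = assembly[start - 1:end]
    merged = "".join(
        p if a.upper() == "N" and p.upper() != "N" else a
        for a, p in zip(region, patch)
    )
    return assembly[:start - 1] + merged + assembly[end:]


if __name__ == "__main__":
    # The interval quoted in the abstract spans 401 positions.
    print(23098 - 22698 + 1)
    # Toy sequences, not real SARS-CoV-2 data.
    print(patch_assembly("ACGTNNNNACGT", "TTGA", 5, 8))
```

A real pipeline would first align the Sanger consensus to the assembly or reference to obtain these coordinates, which is where indels and low-quality trace ends have to be handled.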


Subject(s)
Genome, Viral , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing , Mutation
4.
Sci Rep ; 12(1): 2419, 2022 02 14.
Article in English | MEDLINE | ID: covidwho-1684100

ABSTRACT

This study aimed to develop a feasible and effective universal screening strategy for notable SARS-CoV-2 variants based on Sanger sequencing and to apply it to mass screening in Hiroshima, Japan. A total of 734 samples from confirmed COVID-19 cases in Hiroshima were screened for notable SARS-CoV-2 variants (B.1.1.7, B.1.351, P.1, B.1.617.2, B.1.617.1, C.37, B.1.1.529, etc.). The targeted spike region was amplified by nested RT-PCR using the in-house-designed primer set hCoV-Spike-A and a standard amplification protocol. Additionally, 96 randomly selected samples were also amplified using primer sets hCoV-Spike-B and hCoV-Spike-C. Samples that failed to amplify underwent a second amplification attempt using a volume-up protocol. Thereafter, the amplified products were submitted for Sanger sequencing using the corresponding primers. The positive amplification rates of primer sets hCoV-Spike-A, hCoV-Spike-B and hCoV-Spike-C were 87.3%, 83.3% and 93.8%, respectively, with the standard protocol and increased to 99.6%, 95.8% and 96.9% after the second attempt with the volume-up protocol. The readiness of genome sequences was 96.9%, 100% and 100%, respectively. Among 48 mutant isolates, 26 were B.1.1.7 (Alpha), 7 carried the single E484K mutation, and the rest carried other types of mutation. Moreover, 5 cluster cases with a single mutation, N501S, were reported for the first time in Hiroshima. This study indicates the reliability and effectiveness of Sanger sequencing for screening large numbers of samples for notable SARS-CoV-2 variants. Compared with next-generation sequencing (NGS), our method offers a feasible, universally applicable, and practical tool for identifying emerging variants at lower cost and in less time, especially in countries where NGS is not practically available. Our method makes it possible not only to identify pre-existing variants but also to examine other rare types of mutation or newly emerged variants, and is crucial for prevention and control of the pandemic.


Subject(s)
COVID-19/diagnosis , Mass Screening/methods , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , Spike Glycoprotein, Coronavirus/genetics , Amino Acid Sequence , COVID-19/epidemiology , COVID-19/virology , Feasibility Studies , High-Throughput Nucleotide Sequencing/methods , Humans , Japan/epidemiology , Pandemics/prevention & control , Reproducibility of Results , SARS-CoV-2/physiology , Sensitivity and Specificity , Sequence Homology, Amino Acid
5.
Gene ; 813: 146113, 2022 Mar 01.
Article in English | MEDLINE | ID: covidwho-1616498

ABSTRACT

Since late 2019, when SARS-CoV-2 was reported in Wuhan, several sequence analyses have been performed and SARS-CoV-2 genome sequences have been submitted to various databases. Moreover, the impact of variants on infectivity and response to neutralizing antibodies has been assessed. In the present study, we retrieved a total of 176 complete, high-quality S glycoprotein sequences of Iranian SARS-CoV-2 from the public GISAID and GenBank databases, covering April 2020 to May 2021. We then identified the number of variable, singleton, and parsimony-informative sites at both the gene and protein levels and discussed the possible functional consequences of important mutations for infectivity and response to neutralizing antibodies. A phylogenetic tree was constructed to represent the relationship between Iranian SARS-CoV-2 and variants of concern (VOC), variants of interest (VOI), and the reference sequence. We found that the four current VOCs (Alpha, Beta, Gamma, and Delta) are circulating in different regions of Iran. The Delta variant is notably more transmissible than other variants and is expected to become a dominant variant. However, some of the Delta variants in Iran carry an additional mutation, namely E1202Q in the HR2 subdomain, that might confer an advantage in the viral/cell membrane fusion process. We also observed some more common mutations in Iranian SARS-CoV-2, such as an N-terminal domain (NTD) deletion at position I210 and P863H in the fusion peptide-heptad repeat 1 span region. The mutations reported in the current project have practical significance for predicting disease spread as well as for designing vaccines and drugs.


Subject(s)
COVID-19/genetics , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics , Antibodies, Neutralizing/immunology , Antibodies, Viral/genetics , COVID-19/epidemiology , COVID-19/metabolism , Databases, Genetic , Humans , Iran/epidemiology , Mutation/genetics , Phylogeny , Protein Binding , RNA, Viral , SARS-CoV-2/metabolism , SARS-CoV-2/pathogenicity , Sequence Analysis, DNA/methods , Spike Glycoprotein, Coronavirus/metabolism
6.
Viruses ; 13(12)2021 12 18.
Article in English | MEDLINE | ID: covidwho-1580423

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causal agent of the COVID-19 pandemic that emerged in late 2019. Current monitoring focuses on the emergence of variants with mutations in the region encoding the S1 subunit of the spike protein that can make them more resistant to neutralizing or monoclonal antibodies. This study examines the feasibility of predicting the variant lineage and monitoring the appearance of reported mutations by sequencing only the region encoding the S1 domain with Pacific Biosciences Single Molecule Real-Time sequencing (PacBio SMRT). Using the PacBio SMRT system, we successfully sequenced 186 of the 200 samples previously sequenced with the Illumina COVIDSeq (whole genome) system. PacBio SMRT detected mutations in the S1 domain that were missed by the COVIDSeq system in 27/186 samples (14.5%), due to amplification failure. These missing positions included mutations that are decisive for lineage assignment, such as G142D (n = 11), N501Y (n = 6), or E484K (n = 2). The lineage of 172/186 (92.5%) samples was accurately determined by analyzing the region encoding the S1 domain with a pipeline that uses key positions in S1. Thus, the PacBio SMRT protocol is appropriate for determining virus lineages and detecting key mutations.
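The idea of assigning a lineage from key positions in S1 can be sketched as a lookup against per-lineage mutation signatures. This is a toy illustration, not the pipeline used in the study: the `SIGNATURES` table is a deliberately small, hand-picked assumption built from widely reported spike substitutions, and the function name `assign_lineage` is hypothetical. The abstract does not list the key positions the actual pipeline uses.

```python
from typing import Optional, Set

# Illustrative, incomplete signatures (assumption for this sketch only).
SIGNATURES = {
    "Alpha": {"N501Y", "A570D"},
    "Beta": {"K417N", "E484K", "N501Y"},
    "Gamma": {"K417T", "E484K", "N501Y"},
    "Delta": {"L452R", "T478K"},
}


def assign_lineage(observed: Set[str]) -> Optional[str]:
    """Return the lineage whose full signature is contained in `observed`.

    If several signatures match, the one with the most mutations wins;
    if none match, None is returned.
    """
    matches = [name for name, sig in SIGNATURES.items() if sig <= observed]
    if not matches:
        return None
    return max(matches, key=lambda name: len(SIGNATURES[name]))


if __name__ == "__main__":
    print(assign_lineage({"K417N", "E484K", "N501Y", "D614G"}))
    print(assign_lineage({"G142D", "L452R", "T478K"}))
    print(assign_lineage({"D614G"}))
```

The sketch also shows why a missed call matters: if N501Y or E484K drops out because an amplicon failed, a full-signature match is no longer possible.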


Subject(s)
SARS-CoV-2/genetics , Sequence Analysis, DNA , Spike Glycoprotein, Coronavirus/genetics , COVID-19/virology , Genotype , Humans , Mutation , Protein Interaction Domains and Motifs/genetics , SARS-CoV-2/classification , Sequence Analysis, DNA/methods
7.
Genes (Basel) ; 12(11)2021 11 18.
Article in English | MEDLINE | ID: covidwho-1533884

ABSTRACT

Multiple sequence alignment (MSA) is the basis for almost all sequence comparison and molecular phylogenetic inferences. Large-scale genomic analyses are typically associated with automated progressive MSA without subsequent manual adjustment, which itself is often error-prone because of the lack of a consistent and explicit criterion. Here, I outlined several commonly encountered alignment errors that cannot be avoided by progressive MSA for nucleotide, amino acid, and codon sequences. Methods that could be automated to fix such alignment errors were then presented. I emphasized the utility of position weight matrix as a new tool for MSA refinement and illustrated its usage by refining the MSA of nucleotide and amino acid sequences. The main advantages of the position weight matrix approach include (1) its use of information from all sequences, in contrast to other commonly used methods based on pairwise alignment scores and inconsistency measures, and (2) its speedy computation, making it suitable for a large number of long viral genomic sequences.
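To make the position weight matrix (PWM) idea concrete, here is a minimal sketch under stated assumptions: a uniform background frequency of 0.25, a pseudocount of 1 per base, and log2-odds scores. The helper names (`build_pwm`, `score`, `best_offset`) are hypothetical and the code is not taken from the paper. It builds a PWM from the columns of an existing alignment and then finds where a segment scores best, which is the basic operation behind using a PWM to reposition a misaligned stretch.

```python
import math
from collections import Counter
from typing import Dict, List

ALPHABET = "ACGT"


def build_pwm(aligned: List[str], pseudocount: float = 1.0,
              background: float = 0.25) -> List[Dict[str, float]]:
    """Log2-odds PWM from equal-length aligned sequences (gaps are ignored)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    pwm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned if seq[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        pwm.append({
            base: math.log2((counts[base] + pseudocount) / total / background)
            for base in ALPHABET
        })
    return pwm


def score(pwm: List[Dict[str, float]], segment: str) -> float:
    """Sum of per-column scores; characters outside ACGT contribute 0."""
    return sum(column.get(base, 0.0) for column, base in zip(pwm, segment))


def best_offset(pwm: List[Dict[str, float]], sequence: str) -> int:
    """Offset in `sequence` whose window of len(pwm) scores highest."""
    width = len(pwm)
    offsets = range(len(sequence) - width + 1)
    return max(offsets, key=lambda i: score(pwm, sequence[i:i + width]))


if __name__ == "__main__":
    pwm = build_pwm(["ACGT", "ACGT", "ACGA"])
    print(score(pwm, "ACGT") > score(pwm, "TGCA"))
    print(best_offset(build_pwm(["ACGT"] * 3), "TTACGTTT"))
```

Because every column is summarized once from all sequences, scoring a candidate placement costs time proportional to the segment length rather than to the number of sequences, which is consistent with the speed advantage the abstract claims.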


Subject(s)
Automation, Laboratory/methods , Genomics/methods , Sequence Alignment/methods , Algorithms , Animals , Automation, Laboratory/standards , Genomics/standards , Humans , Phylogeny , Sensitivity and Specificity , Sequence Alignment/standards , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , Sequence Analysis, Protein/methods , Sequence Analysis, Protein/standards
8.
NPJ Biofilms Microbiomes ; 7(1): 81, 2021 11 18.
Article in English | MEDLINE | ID: covidwho-1526078

ABSTRACT

The oral microbiome has been connected with lung health and may be of significance in the progression of SARS-CoV-2 infection. Saliva-based SARS-CoV-2 tests provide the opportunity to leverage stored samples for assessing the oral microbiome. However, these collection kits have not been tested for their accuracy in measuring the oral microbiome. Saliva is highly enriched with human DNA, and reducing it prior to shotgun sequencing may increase the depth of bacterial reads. We examined the effect of both saliva collection method and sequence processing on measurement of microbiome depth and diversity by 16S rRNA gene amplicon sequencing and shotgun metagenomics. We collected 56 samples from 22 subjects. Each subject provided saliva samples with and without preservative, and a subset provided a second set of samples the following day. 16S rRNA gene (V4) sequencing was performed on all samples, and shotgun metagenomics was performed on a subset of samples collected with preservative, with and without human DNA depletion before sequencing. We observed that beta diversity distances within subjects over time were smaller than those between unrelated subjects, and distances within subjects were smaller in samples collected with preservative. Samples collected with preservative had higher alpha diversity, measured by both richness and evenness. Human DNA depletion before extraction and shotgun sequencing yielded higher total and relative numbers of reads mapping to bacterial sequences. We conclude that collecting saliva with preservative may provide more consistent measures of the oral microbiome and that depleting human DNA increases the yield of bacterial sequences.


Subject(s)
Microbiota/genetics , Saliva/microbiology , Adult , Bacteria/genetics , COVID-19/genetics , DNA/genetics , DNA, Bacterial/genetics , Female , Humans , Male , Metagenome/genetics , Metagenomics/methods , Middle Aged , RNA, Ribosomal, 16S/genetics , SARS-CoV-2/pathogenicity , Sequence Analysis, DNA/methods
9.
Infect Genet Evol ; 96: 105106, 2021 12.
Article in English | MEDLINE | ID: covidwho-1506080

ABSTRACT

Coronaviruses (especially SARS-CoV-2) are characterized by rapid mutation and wide spread. As these characteristics easily lead to global pandemics, studying the evolutionary relationship between viruses is essential for clinical diagnosis. DNA sequencing has played an important role in evolutionary analysis. Recent alignment-free methods can overcome the problems of traditional alignment-based methods, which consume both time and space. This paper proposes a novel alignment-free method called the correlation coefficient feature vector (CCFV), which defines a correlation measure of the L-step delay of a nucleotide location from its location in the original DNA sequence. The numerical feature is a 16×L-dimensional numerical vector describing the distribution characteristics of the nucleotide positions in a DNA sequence. The proposed L-step delay correlation measure is interestingly related to some types of L+1 spaced mers. Unlike traditional gene comparison, our method avoids the computational complexity of multiple sequence alignment, and hence improves the speed of sequence comparison. Our method is applied to evolutionary analysis of the common human viruses including SARS-CoV-2, Dengue virus, Hepatitis B virus, and human rhinovirus and achieves the same or even better results than alignment-based methods. Especially for SARS-CoV-2, our method also confirms that bats are potential intermediate hosts of SARS-CoV-2.


Subject(s)
Genome, Viral/genetics , Phylogeny , Sequence Analysis, DNA/methods , Coronavirus/genetics , Dengue Virus/genetics , Hepatitis B/genetics , Humans , Models, Genetic , Rhinovirus/genetics , SARS-CoV-2/genetics , Sequence Alignment
10.
Viruses ; 13(10)2021 09 30.
Article in English | MEDLINE | ID: covidwho-1481008

ABSTRACT

Measles virus (MeV) genotype B3 is a globally significant circulating genotype. Here, we present a systematic description of the long-term evolutionary characteristics of the MeV genotype B3 hemagglutinin (H) gene in the elimination era. Our results show that the B3 H gene can be divided into two main sub-genotypes, and the highest intra-genotypic diversity was observed in 2004. The MeV genotype B3 H gene diverged in 1976; its overall nucleotide substitution rate is estimated to be 5.697 × 10⁻⁴ substitutions/site/year and is slowing down. The amino acid substitution rate of the genotype B3 H gene is also decreasing, and the mean effective population size has been on a downward trend since 2000. Selection pressure analysis recognized only a few sites under positive selection, and the number of positively selected sites is decreasing. All of these observations may indicate that the genotype B3 H gene is not under strong selection pressure and is becoming increasingly conserved. MeV H-gene or whole-genome sequencing should be routine, so as to better elucidate the molecular epidemiology of MeV in the future.


Subject(s)
Hemagglutinins, Viral/genetics , Measles virus/genetics , China , Evolution, Molecular , Genetic Variation/genetics , Genotype , Hemagglutinins/genetics , Humans , Measles/virology , Molecular Epidemiology/methods , Phylogeny , Sequence Analysis, DNA/methods
11.
Viruses ; 13(10)2021 09 29.
Article in English | MEDLINE | ID: covidwho-1441884

ABSTRACT

Bats have been identified as natural reservoirs of a variety of coronaviruses. They harbor at least 19 of the 33 defined species of alpha- and betacoronaviruses. Previously, the bat coronavirus HKU10 was found in two bat species of different suborders, Rousettus leschenaultia and Hipposideros pomona, in south China. However, its geographic distribution and evolutionary history have not been fully investigated. Here, we screened for this viral species by nested reverse transcriptase PCR in our archived samples collected over 10 years from 25 provinces of China and one province of Laos. Of 8004 bat fecal samples, 26 were found to be positive for bat coronavirus HKU10 (BtCoV HKU10). New habitats of BtCoV HKU10 were found in the Yunnan, Guangxi, and Hainan Provinces of China, and Louang Namtha Province in Laos. In addition to H. pomona, BtCoV HKU10 variants were found circulating in Aselliscus stoliczkanus and Hipposideros larvatus. We sequenced full-length genomes of 17 newly discovered BtCoV HKU10 strains and compared them with previously published sequences. Our results revealed a much higher genetic diversity of BtCoV HKU10, particularly in spike genes and accessory genes. Besides the two previously reported lineages, we found six novel lineages in their new habitats, three of which were located in Yunnan province. The genotypes of these viruses are closely related to sampling locations based on polyproteins, and correlated with bat species based on spike genes. Combining phylogenetic analysis, selective pressure, and molecular-clock calculation, we demonstrated that Yunnan bats harbor a gene pool of BtCoV HKU10, with H. pomona as a natural reservoir. The cell tropism test using a spike-pseudotyped lentivirus system showed that BtCoV HKU10 could enter human and bat cells, suggesting a potential for interspecies spillover. Continued studies on these bat coronaviruses will expand our understanding of the evolution and genetic diversity of coronaviruses, and provide an early warning of potential zoonotic diseases from bats.


Subject(s)
Alphacoronavirus/genetics , Chiroptera/virology , Alphacoronavirus/pathogenicity , Animals , Base Sequence/genetics , Biological Evolution , China , Chiroptera/genetics , Coronavirus/genetics , Coronavirus/pathogenicity , Coronavirus Infections/virology , Evolution, Molecular , Genetic Variation/genetics , Genome, Viral/genetics , Genotype , Phylogeny , Sequence Analysis, DNA/methods , Viral Proteins/genetics
13.
Nucleic Acids Res ; 49(D1): D92-D96, 2021 01 08.
Article in English | MEDLINE | ID: covidwho-1387961

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 9.9 trillion base pairs from over 2.1 billion nucleotide sequences for 478 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. Recent updates include new resources for data from the SARS-CoV-2 virus, updates to the NCBI Submission Portal and associated submission wizards for dengue and SARS-CoV-2 viruses, new taxonomy queries for viruses and prokaryotes, and simplified submission processes for EST and GSS sequences.


Subject(s)
Computational Biology/statistics & numerical data , Databases, Nucleic Acid , Genomics/methods , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , Animals , COVID-19/epidemiology , COVID-19/virology , Computational Biology/methods , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Annotation/methods , Pandemics
14.
FEBS Open Bio ; 11(9): 2441-2452, 2021 09.
Article in English | MEDLINE | ID: covidwho-1380363

ABSTRACT

Whole genome and exome sequencing (WGS/WES) are the most popular next-generation sequencing (NGS) methodologies and are at present often used to detect rare and common genetic variants of clinical significance. We emphasize that automated sequence data processing, management, and visualization should be an indispensable component of modern WGS and WES data analysis for sequence assembly, variant detection (SNPs, SVs), imputation, and resolution of haplotypes. In this manuscript, we present a newly developed findable, accessible, interoperable, and reusable (FAIR) bioinformatics-genomics pipeline Java based Whole Genome/Exome Sequence Data Processing Pipeline (JWES) for efficient variant discovery and interpretation, and big data modeling and visualization. JWES is a cross-platform, user-friendly, product line application, that entails three modules: (a) data processing, (b) storage, and (c) visualization. The data processing module performs a series of different tasks for variant calling, the data storage module efficiently manages high-volume gene-variant data, and the data visualization module supports variant data interpretation with Circos graphs. The performance of JWES was tested and validated in-house with different experiments, using Microsoft Windows, macOS Big Sur, and UNIX operating systems. JWES is an open-source and freely available pipeline, allowing scientists to take full advantage of all the computing resources available, without requiring much computer science knowledge. We have successfully applied JWES for processing, management, and gene-variant discovery, annotation, prediction, and genotyping of WGS and WES data to analyze variable complex disorders. In summary, we report the performance of JWES with some reproducible case studies, using open access and in-house generated, high-quality datasets.


Subject(s)
Computational Biology/methods , Exome , Genome , Genomics/methods , Sequence Analysis, DNA/methods , Software , Data Management , Databases, Genetic , Genetic Variation , Humans , Molecular Sequence Annotation , Reproducibility of Results , Whole Exome Sequencing , Whole Genome Sequencing , Workflow
15.
PLoS One ; 16(8): e0244468, 2021.
Article in English | MEDLINE | ID: covidwho-1371999

ABSTRACT

The newly emerged and rapidly spreading SARS-CoV-2 causes coronavirus disease 2019 (COVID-19). To facilitate a deeper understanding of the viral biology, we developed a capture sequencing methodology to generate SARS-CoV-2 genomic and transcriptome sequences from infected patients. We utilized an oligonucleotide probe-set representing the full-length genome to obtain both genomic and transcriptome (subgenomic open reading frames [ORFs]) sequences from 45 SARS-CoV-2 clinical samples with varying viral titers. For samples with higher viral loads (cycle threshold value under 33, based on the CDC qPCR assay), complete genomes were generated. Analysis of junction reads revealed regions of differential transcriptional activity among samples. Mixed allelic frequencies along the 20 kb ORF1ab gene in one sample suggested the presence of a defective viral RNA species subpopulation maintained in mixture with functional RNA. The associated workflow is straightforward, and hybridization-based capture offers an effective and scalable approach for sequencing SARS-CoV-2 from patient samples.


Subject(s)
COVID-19/pathology , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , COVID-19/virology , DNA, Complementary/chemistry , DNA, Complementary/metabolism , Gene Frequency , Genetic Variation , Genome, Viral , Humans , Open Reading Frames/genetics , RNA, Viral/genetics , RNA, Viral/metabolism , Real-Time Polymerase Chain Reaction , SARS-CoV-2/isolation & purification , Viral Load
16.
Sci Rep ; 11(1): 15869, 2021 08 05.
Article in English | MEDLINE | ID: covidwho-1345586

ABSTRACT

Since December 2019, a novel coronavirus responsible for severe acute respiratory syndrome (SARS-CoV-2) has caused a major pandemic. The emergence of the B.1.1.7 strain as a highly transmissible variant has accelerated worldwide interest in tracking the occurrence of SARS-CoV-2 variants. Other highly infectious variants have since been described, and further ones are expected to be discovered given how long the pandemic has lasted. All described SARS-CoV-2 variants carry several mutations within the gene encoding the Spike protein, which is involved in host receptor recognition and entry into the cell. Hence, instead of sequencing the whole viral genome for variant tracking, we propose focusing on the Spike region to increase the number of candidate samples that can be screened at once, an essential aspect for accelerating diagnostics as well as surveillance of variant emergence and progression. This proof-of-concept study accomplishes both population-scale diagnostics and variant tracking at once. The strategy relies on (1) the portable MinION DNA sequencer; (2) DNA barcoding and Spike gene-centered variant tracking, which increase the number of candidates per assay; and (3) real-time monitoring of diagnostics and variant tracking with our software RETIVAD. This strategy represents an optimal solution for addressing the current needs of SARS-CoV-2 progression surveillance, notably because it is affordable to implement, allowing its deployment even in remote places around the world.


Subject(s)
COVID-19/diagnosis , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , COVID-19/virology , COVID-19 Nucleic Acid Testing/instrumentation , COVID-19 Nucleic Acid Testing/methods , Genome, Viral , Humans , Nanopores , RNA, Viral/genetics , Sequence Analysis, DNA/instrumentation , Spike Glycoprotein, Coronavirus/genetics
17.
Genomics ; 113(5): 3174-3184, 2021 09.
Article in English | MEDLINE | ID: covidwho-1320193

ABSTRACT

As mutations in SARS-CoV-2 virus accumulate rapidly, novel primers that amplify this virus sensitively and specifically are in demand. We have developed a webserver named CoVrimer by which users can search for and align existing or newly designed conserved/degenerate primer pair sequences against the viral genome and assess the mutation load of both primers and amplicons. CoVrimer uses mutation data obtained from an online platform established by NGDC-CNCB (12 May 2021) to identify genomic regions, either conserved or with low levels of mutations, from which potential primer pairs are designed and provided to the user for filtering based on generalized and SARS-CoV-2 specific parameters. Alignments of primers and probes can be visualized with respect to the reference genome, indicating variant details and the level of conservation. Consequently, CoVrimer is likely to help researchers with the challenges posed by viral evolution and is freely available at http://konulabapps.bilkent.edu.tr:3838/CoVrimer/.
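Assessing the mutation load of a primer or amplicon, as described above, amounts to counting known variable positions inside a genomic interval. The sketch below is a hypothetical illustration and not CoVrimer's code or data model: the function name `mutation_load`, the position-to-frequency dictionary, the frequency threshold, and all coordinates are assumptions made for the example.

```python
from typing import Dict


def mutation_load(start: int, end: int, mutation_freq: Dict[int, float],
                  min_freq: float = 0.01) -> int:
    """Count positions in [start, end] (1-based, inclusive) whose reported
    mutation frequency is at least `min_freq`.

    `mutation_freq` maps a reference position to the fraction of genomes
    carrying any mutation there (hypothetical input format).
    """
    if start > end:
        raise ValueError("start must not exceed end")
    return sum(
        1 for pos in range(start, end + 1)
        if mutation_freq.get(pos, 0.0) >= min_freq
    )


if __name__ == "__main__":
    # Made-up positions and frequencies, for illustration only.
    freqs = {100: 0.02, 105: 0.001, 130: 0.5}
    print(mutation_load(95, 115, freqs))    # forward primer interval
    print(mutation_load(120, 140, freqs))   # reverse primer interval
    print(mutation_load(95, 140, freqs))    # whole amplicon
```

A primer pair would then be filtered by requiring a low load in both primer intervals, optionally with a stricter threshold near the 3' ends, where mismatches tend to be most disruptive for amplification.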


Subject(s)
DNA Primers/chemistry , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , Software , Conserved Sequence , DNA Primers/genetics , Genome, Viral , Mutation
18.
Sci Rep ; 11(1): 14558, 2021 07 15.
Article in English | MEDLINE | ID: covidwho-1315608

ABSTRACT

Whereas accelerated attention beclouded the early stages of the coronavirus spread, knowledge of the actual pathogenicity and origin of possible sub-strains remained unclear. Harvesting the Global initiative on Sharing All Influenza Data (GISAID) database ( https://www.gisaid.org/ ) between December 2019 and January 15, 2021, we analyzed a total of 8864 complete human SARS-CoV-2 genome sequences, processed by gender, from 6 continents (88 countries), Antarctica excepted. We hypothesized that the data speak for themselves and can reveal true and explainable patterns of the disease. Identical genome diversity and pattern correlates analysis, performed using a hybrid of biotechnology and machine learning methods, corroborates the emergence of inter- and intra-SARS-CoV-2 sub-strain transmission and supports an increase in sub-strains within the various continents, with nucleotide mutations varying dynamically between individuals in close association with the virus as it adapts to its host/environment. Interestingly, some viral sub-strain patterns progressively transformed into new sub-strain clusters, indicating varying amino acids and a strong nucleotide association derived from the same lineage. A novel cognitive approach to knowledge mining helped the discovery of transmission routes and a seamless contact tracing protocol. Our classification results were better than those of state-of-the-art methods, indicating a more robust system for predicting emerging or new viral sub-strain(s). The results therefore offer explanations for the growing concerns about the virus and its next wave(s). A future direction of this work is a defuzzification of confusable pattern clusters for precise intra-country SARS-CoV-2 sub-strain analytics.


Subject(s)
COVID-19/virology , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , COVID-19/epidemiology , COVID-19/transmission , Computational Biology/methods , DNA, Viral/genetics , Databases, Genetic , Forecasting/methods , Genome, Viral , Humans , Machine Learning , Mutation , Phylogeny , SARS-CoV-2/classification , SARS-CoV-2/pathogenicity , Whole Genome Sequencing/methods
19.
Genes (Basel) ; 11(10)2020 09 24.
Article in English | MEDLINE | ID: covidwho-1298143

ABSTRACT

Since the release of the MinION sequencer in 2014, it has been applied to great effect in the remotest and harshest of environments, and even in space. One of the most common applications of MinION is for nanopore-based DNA barcoding in situ for species identification and discovery, yet the existing sample capability is limited (n ≤ 10). Here, we assembled a portable sequencing setup comprising the BentoLab and MinION and developed a workflow capable of processing 32 samples simultaneously. We demonstrated this enhanced capability out at sea, where we collected samples and barcoded them onboard a dive vessel moored off Sisters' Islands Marine Park, Singapore. In under 9 h, we generated 105 MinION barcodes, of which 19 belonged to fresh metazoans processed immediately after collection. Our setup is thus viable and would greatly fortify existing portable DNA barcoding capabilities. We also tested the performance of the newly released R10.3 nanopore flow cell for DNA barcoding, and showed that the barcodes generated were ~99.9% accurate when compared to Illumina references. A total of 80% of the R10.3 nanopore barcodes also had zero base ambiguities, compared to 50-60% for R9.4.1, suggesting an improved homopolymer resolution and making the use of R10.3 highly recommended.


Subject(s)
Aquatic Organisms/genetics , Coral Reefs , DNA Barcoding, Taxonomic/methods , High-Throughput Nucleotide Sequencing/methods , Nanopores , Sequence Analysis, DNA/methods , Software , Animals , Biodiversity
20.
PLoS One ; 16(6): e0252534, 2021.
Article in English | MEDLINE | ID: covidwho-1270459

ABSTRACT

Many recent disease outbreaks in humans have had a zoonotic viral etiology. Bats in particular have been recognized as reservoirs of a large variety of viruses with the potential for cross-species transmission. In order to assess the risk of such transmissions from bats in Switzerland, we determined the virome of tissue and fecal samples of 14 native and 4 migrating bat species. In total, sequences belonging to 39 different virus families, 16 of which are known to infect vertebrates, were detected. Contigs of coronaviruses, adenoviruses, hepeviruses, rotaviruses A and H, and parvoviruses with potential zoonotic risk were characterized in more detail. Most interestingly, in a ground stool sample from a Vespertilio murinus colony, an almost complete genome of a Middle East respiratory syndrome-related coronavirus (MERS-CoV) was detected by next-generation sequencing and confirmed by PCR. In conclusion, bats in Switzerland naturally harbour many different viruses. Metagenomic analyses of non-invasive samples such as ground stool may support effective surveillance and early detection of viral zoonoses.


Subject(s)
Chiroptera/virology , Feces/virology , Metagenomics/methods , Virome/genetics , Viruses/genetics , Zoonoses/virology , Adenoviridae/classification , Adenoviridae/genetics , Animals , Chiroptera/classification , Disease Reservoirs/virology , Genetic Variation , Genome, Viral/genetics , Hepevirus/classification , Hepevirus/genetics , Humans , Middle East Respiratory Syndrome Coronavirus/classification , Middle East Respiratory Syndrome Coronavirus/genetics , Phylogeny , Rotavirus/classification , Rotavirus/genetics , Sequence Analysis, DNA/methods , Switzerland , Viruses/classification