Results 1 - 20 of 30
1.
BMC Bioinformatics ; 23(1): 187, 2022 May 17.
Article in English | MEDLINE | ID: covidwho-1846792

ABSTRACT

The rapid global spread and dissemination of SARS-CoV-2 has provided the virus with numerous opportunities to develop several variants. Thus, it is critical to determine the degree of the variations and in which part of the virus those variations occurred. Therefore, in this study, methods that could be used to vectorize the sequence data, perform clustering analysis, and visualize the results were proposed using machine learning methods. To conduct this study, a total of 224,073 cases of SARS-CoV-2 sequence data were collected through NCBI and GISAID, and the data were visualized using dimensionality reduction and clustering analysis models such as t-SNE and DBSCAN. In the cluster results, the originally detected SARS-CoV-2 virus was distinguished from later variants, including Omicron and Delta. Furthermore, it was possible to examine which codon changes in the spike protein caused the variants to be distinguished using feature importance extraction models such as Random Forest or Shapley values. The proposed method has the advantage of being able to analyse and visualize a large amount of data at once compared to the existing tree-based sequence data analysis. The proposed method was able to identify and visualize significant changes between the SARS-CoV-2 virus, which was first detected in Wuhan, China, in December 2019, and the newly formed mutant virus group. As a result of clustering analysis using sequence data, it was possible to confirm the formation of clusters among various variants in a two-dimensional graph, and by extracting the importance of variables, it was possible to confirm which codon changes played a major role in distinguishing variants. Furthermore, since the proposed method can handle a variety of data sequences, it can be used for all kinds of diseases, including influenza and SARS-CoV-2. Therefore, the proposed method has the potential to become widely used for the effective analysis of disease variations.


Subject(s)
COVID-19 , Magnoliopsida , Cluster Analysis , Codon , Machine Learning , SARS-CoV-2/genetics
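
The abstract above mentions vectorizing sequence data and locating the codon changes that separate variants, but does not say how. The following is a minimal sketch of one possible encoding, not the authors' method. It assumes the coding sequences are already aligned and of equal length; the function names (`split_codons`, `codon_vector`, `codon_changes`) and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def split_codons(cds: str) -> List[str]:
    """Split a coding sequence into in-frame codons; a trailing partial codon is dropped."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def codon_vector(cds: str, vocab: Dict[str, int]) -> List[int]:
    """Encode each codon as an integer id, adding unseen codons to the shared vocabulary."""
    return [vocab.setdefault(codon, len(vocab)) for codon in split_codons(cds)]


def codon_changes(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (1-based codon position, reference codon, sample codon) wherever the two differ."""
    pairs = zip(split_codons(reference), split_codons(sample))
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


# Toy sequences (illustrative only, not real spike sequences).
reference = "ATGGATTATCAG"
sample = "ATGGGTTATCAG"

vocab: Dict[str, int] = {}
reference_vector = codon_vector(reference, vocab)
sample_vector = codon_vector(sample, vocab)
changes = codon_changes(reference, sample)
```

Integer vectors like these could then be passed to a dimensionality-reduction or clustering step; that step is omitted here because it needs third-party libraries.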
2.
Viruses ; 12(5), 2020 04 30.
Article in English | MEDLINE | ID: covidwho-1726009

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which first occurred in Wuhan (China) in December of 2019, causes a severe acute respiratory illness with a high mortality rate, and has spread around the world. To gain an understanding of the evolution of the newly emerging SARS-CoV-2, we herein analyzed the codon usage pattern of SARS-CoV-2. For this purpose, we compared the codon usage of SARS-CoV-2 with that of other viruses belonging to the subfamily of Orthocoronavirinae. We found that SARS-CoV-2 has a high AU content that strongly influences its codon usage, which appears to be better adapted to the human host. We also studied the evolutionary pressures that influence the codon usage of five conserved coronavirus genes encoding the viral replicase, spike, envelope, membrane and nucleocapsid proteins. We found different patterns of both mutational bias and natural selection that affect the codon usage of these genes. Moreover, we show here that the two integral membrane proteins (matrix and envelope) tend to evolve slowly by accumulating nucleotide mutations on their corresponding genes. Conversely, genes encoding nucleocapsid (N), viral replicase and spike proteins (S), although they are regarded as important targets for the development of vaccines and antiviral drugs, tend to evolve faster in comparison to the two genes mentioned above. Overall, our results suggest that the higher divergence observed for the latter three genes could represent a significant barrier in the development of antiviral therapeutics against SARS-CoV-2.


Subject(s)
Betacoronavirus/genetics , Codon , Coronavirus/genetics , Genome, Viral , Base Composition , Betacoronavirus/chemistry , Betacoronavirus/physiology , Biological Evolution , Coronavirus/classification , Genes, Viral , Host Specificity , Mutation , Phylogeny , SARS-CoV-2
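
The AU content reported in the abstract above is a simple base-composition measure. The sketch below shows how it could be computed for a whole sequence and for third codon positions only. It is illustrative: the function names and the toy sequence are hypothetical, and the sequence is assumed to be an in-frame coding sequence.

```python
def au_content(seq: str) -> float:
    """Fraction of A and U (or T) among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper().replace("T", "U") if b in "ACGU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AU" for b in bases) / len(bases)


def au3_content(cds: str) -> float:
    """AU fraction restricted to third codon positions of an in-frame coding sequence."""
    return au_content(cds[2::3])


# Toy coding sequence (illustrative only).
toy_cds = "AUGGCUAAAUUU"
overall_au = au_content(toy_cds)
third_position_au = au3_content(toy_cds)
```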
3.
Biophys Chem ; 285: 106780, 2022 06.
Article in English | MEDLINE | ID: covidwho-1693833

ABSTRACT

Messenger RNAs (mRNAs) serve as blueprints for protein synthesis by the ribosome, a molecular machine. The ribosome relies on hydrogen bonding interactions between adaptor aminoacyl-transfer RNA molecules and mRNAs to ensure the rapid and faithful translation of the genetic code into protein. There is a growing body of evidence suggesting that chemical modifications to mRNA nucleosides impact the speed and accuracy of protein synthesis by the ribosome. Modulations in translation rates have downstream effects beyond protein production, influencing protein folding and mRNA stability. Given the prevalence of such modifications in mRNA coding regions, it is imperative to understand the consequences of individual modifications on translation. In this review we present the current state of our knowledge regarding how individual mRNA modifications influence ribosome function. Our comprehensive comparison of the impacts of 16 different mRNA modifications on translation reveals that most modifications can alter the elongation step in the protein synthesis pathway. Additionally, we discuss the context dependence of these effects, highlighting the necessity of further study to uncover the rules that govern how any given chemical modification in an mRNA codon is read by the ribosome.


Subject(s)
Peptide Chain Elongation, Translational , Protein Biosynthesis , Codon/analysis , Codon/metabolism , Proteins/metabolism , RNA Stability , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/chemistry , Ribosomes/genetics , Ribosomes/metabolism
4.
Microbiol Spectr ; 10(1): e0165521, 2022 02 23.
Article in English | MEDLINE | ID: covidwho-1673364

ABSTRACT

Although lessons have been learned from previous severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) outbreaks, the rapid evolution of the viruses means that future outbreaks of a much larger scale are possible, as shown by the current coronavirus disease 2019 (COVID-19) outbreak. Therefore, it is necessary to better understand the evolution of coronaviruses as well as viruses in general. This study reports a comparative analysis of the amino acid usage within several key viral families and genera that are prone to triggering outbreaks, including coronavirus (severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2], SARS-CoV, MERS-CoV, human coronavirus-HKU1 [HCoV-HKU1], HCoV-OC43, HCoV-NL63, and HCoV-229E), influenza A (H1N1 and H3N2), flavivirus (dengue virus serotypes 1 to 4 and Zika) and ebolavirus (Zaire, Sudan, and Bundibugyo ebolavirus). Our analysis reveals that the distribution of amino acid usage in the viral genome is constrained to follow a linear order, and the distribution remains closely related to the viral species within the family or genus. This constraint can be adapted to predict viral mutations and future variants of concern. By studying previous SARS and MERS outbreaks, we have adapted this naturally occurring pattern to determine that although pangolin plays a role in the outbreak of COVID-19, it may not be the sole agent as an intermediate animal. In addition to this study, our findings contribute to the understanding of viral mutations for subsequent development of vaccines and toward developing a model to determine the source of the outbreak. IMPORTANCE This study reports a comparative analysis of amino acid usage within several key viral genera that are prone to triggering outbreaks. Interestingly, there is evidence that the amino acid usage within the viral genomes is not random but in a linear order.


Subject(s)
Coronavirus/genetics , Ebolavirus/genetics , Evolution, Molecular , Flavivirus/genetics , Influenza A Virus, H1N1 Subtype/genetics , Influenza A Virus, H3N2 Subtype/genetics , Codon , Coronavirus/classification , Genome, Viral , Humans , Linear Models , Mutation , SARS-CoV-2/genetics , Virus Diseases/virology
5.
Int J Biol Macromol ; 204: 356-363, 2022 Apr 15.
Article in English | MEDLINE | ID: covidwho-1670549

ABSTRACT

Infections caused by SARS-CoV-2 have brought great harm to human health. After transmission for over two years, SARS-CoV-2 has diverged greatly and formed dozens of different lineages. Understanding the trend of its genome evolution could help foresee difficulties in controlling transmission of the virus. In this study, we conducted an extensive monthly survey and in-depth analysis on variations of nucleotide, amino acid and codon numbers in 311,260 virus samples collected till January 2022. The results demonstrate that the evolution of SARS-CoV-2 is toward increasing U-content and reducing genome-size. C, G and A to U mutations have all contributed to this U-content increase. Mutations of C, G and A at codon position 1, 2 or 3 show no significant difference in most SARS-CoV-2 lineages. Current viruses are more cryptic and more efficient in replication, and are thus less virulent yet more infectious. Delta and Omicron variants have higher mutability than other lineages, bringing a new threat to human health. This trend of genome evolution may provide a clue for tracing the origin of SARS-CoV-2, because ancestral viruses should have lower U-content and probably bigger genome-size.


Subject(s)
Base Composition/genetics , COVID-19/genetics , SARS-CoV-2/genetics , Base Sequence/genetics , COVID-19/transmission , China , Codon/genetics , Evolution, Molecular , Genome/genetics , Genome Size/genetics , Genome, Viral/genetics , Humans , Mutation/genetics , Phylogeny , SARS-CoV-2/pathogenicity , Uracil/metabolism
6.
Virology ; 568: 56-71, 2022 03.
Article in English | MEDLINE | ID: covidwho-1665518

ABSTRACT

SARS-CoV-2, the seventh coronavirus known to infect humans, can cause severe life-threatening respiratory pathologies. To better understand SARS-CoV-2 evolution, genome-wide analyses have been made, including the general characterization of its codon usage profile. Here we present a bioinformatic analysis of the evolution of SARS-CoV-2 codon usage over time using complete genomes collected since December 2019. Our results show that the SARS-CoV-2 codon usage pattern is antagonistic to, and is moving farther away from, that of the human host. Further, a selection of deoptimized codons over time, which was accompanied by a decrease in both the codon adaptation index and the effective number of codons, was observed. Altogether, these findings suggest that SARS-CoV-2 could be evolving, at least from the perspective of the synonymous codon usage, to become less pathogenic.


Subject(s)
COVID-19/epidemiology , COVID-19/virology , Codon Usage , Codon , Evolution, Molecular , Pandemics , SARS-CoV-2/genetics , Betacoronavirus/classification , Betacoronavirus/genetics , Gene Expression Regulation, Viral , Genome, Viral , Genomics/methods , Humans , Open Reading Frames , Organ Specificity , Phylogeny
7.
Infect Genet Evol ; 97: 105175, 2022 01.
Article in English | MEDLINE | ID: covidwho-1555685

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread all over the world and brought great harm to humans in many countries. Many new SARS-CoV-2 variants appeared during its transmission. In the present study, the Delta variants (B.1.617.2) of SARS-CoV-2, which have appeared in many countries, were considered for analysis. In order to evaluate the evolutionary divergence of the Delta variants (B.1.617.2), the codon usage divergence in Delta variants (B.1.617.2) of SARS-CoV-2 was compared to that of the SARS-CoV-2 genomes that emerged before June 2020. All Delta variants (B.1.617.2) and 350 early genomes of SARS-CoV-2 in the NCBI database were downloaded. Codon usage patterns, including the basic composition, the GC ratio of the third position (GC3) and the first two positions (GC12) in codons, overall GC contents, the effective number of codons (ENC), the codon bias index (CBI), the relative synonymous codon usage (RSCU) values, etc., were calculated for all important gene sequences of interest. Their codon usage divergence was calculated by summing their standard deviations. The results suggested that base compositions in both Delta variants (B.1.617.2) of SARS-CoV-2 and the early SARS-CoV-2 genomes were similar to each other. However, the internal codon usage divergence for most genes in Delta variants (B.1.617.2) was significantly wider than that of the early SARS-CoV-2 genomes. The RSCU values were further used to explore the synonymous and non-synonymous mutations in the sequences of the Delta variants (B.1.617.2), and the results showed that the synonymous mutations are more obvious than the non-synonymous ones in the sequences of interest. The related codon usage divergence analysis is helpful for further study on the adaptability and disease prognosis of the SARS-CoV-2 variants.


Subject(s)
COVID-19/epidemiology , Codon/chemistry , Genome, Viral , Mutation , SARS-CoV-2/genetics , Viral Proteins/genetics , Base Composition , COVID-19/transmission , COVID-19/virology , Databases, Genetic , Epidemiological Monitoring , Evolution, Molecular , Gene Expression , Humans , Open Reading Frames , SARS-CoV-2/classification , SARS-CoV-2/pathogenicity , Viral Proteins/metabolism
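
Two of the indices named in the abstract above, RSCU and GC3, have short standard definitions. RSCU is a codon's observed count divided by the mean count of all synonymous codons for the same amino acid. GC3 is the G+C fraction at third codon positions. The sketch below implements both under the standard genetic code. It is not the authors' pipeline, and the function names and toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict, List

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds: str) -> List[str]:
    """Split a coding sequence into in-frame codons; a trailing partial codon is dropped."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def rscu(cds: str) -> Dict[str, float]:
    """RSCU per codon, for amino acids that occur in the sequence (stop codons excluded)."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families: Dict[str, List[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values: Dict[str, float] = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for codon in family:
            # Observed count divided by the mean count across the synonymous family.
            values[codon] = counts[codon] * len(family) / total
    return values


def gc3(cds: str) -> float:
    """G+C fraction at third codon positions."""
    thirds = [codon[2] for codon in split_codons(cds)]
    return sum(b in "GC" for b in thirds) / len(thirds)


# Toy sequence: four alanine codons (GCT x3, GCC x1); illustrative only.
toy_values = rscu("GCTGCTGCTGCC")
toy_gc3 = gc3("GCTGCTGCTGCC")
```

An RSCU of 1.0 means a codon is used as often as expected if all synonymous codons were used equally. Values above 1.0 indicate preferred codons.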
8.
Comput Biol Chem ; 95: 107594, 2021 Dec.
Article in English | MEDLINE | ID: covidwho-1482516

ABSTRACT

India, with around 15 million COVID-19 cases, recently became the second worst-hit nation by the SARS-CoV-2 pandemic. In this study, we analyzed the mutation and selection landscape of 516 unique and complete genomes of SARS-CoV-2 isolates from India in a 12-month span (from Jan to Dec 2020) to understand how the virus is evolving in this geographical region. We identified 953 genome-wide loci displaying single nucleotide polymorphism (SNP), and the Principal Component Analysis and mutation plots of the datasets indicate an increase in genetic variance with time. Of the polymorphic sites, 42% display substitutions in the third nucleotide position of codons, indicating that non-synonymous substitutions are more prevalent. These isolates displayed strong evidence of purifying selection in ORF1ab, spike, nucleocapsid, and membrane glycoprotein. We also find some evidence of localized positive selection in ORF1ab, spike glycoprotein, and nucleocapsid. The CDSs for ORF3a, ORF8, nucleocapsid phosphoprotein, and spike glycoprotein were found to evolve at a rapid rate. This study will be helpful in understanding the dynamics of rapidly evolving SARS-CoV-2.


Subject(s)
Coronavirus Nucleocapsid Proteins/genetics , Evolution, Molecular , Genome, Viral , Open Reading Frames , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics , COVID-19/virology , Codon , Humans , India , Phosphoproteins/genetics , Polymorphism, Single Nucleotide
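
The abstract above classifies polymorphic sites by codon position and by whether the substitution is synonymous. A minimal sketch of that classification for a single nucleotide substitution follows. It assumes the standard genetic code and an in-frame reference coding sequence; the function name and toy sequence are hypothetical.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(ref_cds: str, pos: int, alt: str) -> Tuple[int, str]:
    """Classify a substitution at 0-based nucleotide index `pos` of an in-frame CDS.

    Returns (codon position 1..3, "synonymous" or "non-synonymous").
    """
    ref_cds = ref_cds.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    offset = pos % 3
    start = pos - offset
    ref_codon = ref_cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("position falls in a trailing partial codon")
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    same_residue = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return offset + 1, "synonymous" if same_residue else "non-synonymous"


# Toy reference CDS: ATG GCT GAT (Met-Ala-Asp); illustrative only.
toy_ref = "ATGGCTGAT"
third_position_change = classify_substitution(toy_ref, 5, "C")   # GCT -> GCC, Ala -> Ala
second_position_change = classify_substitution(toy_ref, 7, "G")  # GAT -> GGT, Asp -> Gly
```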
9.
Cell ; 184(20): 5189-5200.e7, 2021 09 30.
Article in English | MEDLINE | ID: covidwho-1401295

ABSTRACT

The independent emergence late in 2020 of the B.1.1.7, B.1.351, and P.1 lineages of SARS-CoV-2 prompted renewed concerns about the evolutionary capacity of this virus to overcome public health interventions and rising population immunity. Here, by examining patterns of synonymous and non-synonymous mutations that have accumulated in SARS-CoV-2 genomes since the pandemic began, we find that the emergence of these three "501Y lineages" coincided with a major global shift in the selective forces acting on various SARS-CoV-2 genes. Following their emergence, the adaptive evolution of 501Y lineage viruses has involved repeated selectively favored convergent mutations at 35 genome sites, mutations we refer to as the 501Y meta-signature. The ongoing convergence of viruses in many other lineages on this meta-signature suggests that it includes multiple mutation combinations capable of promoting the persistence of diverse SARS-CoV-2 lineages in the face of mounting host immune recognition.


Subject(s)
COVID-19/epidemiology , Evolution, Molecular , Mutation , Pandemics , SARS-CoV-2/genetics , Amino Acid Sequence/genetics , COVID-19/immunology , COVID-19/transmission , COVID-19/virology , Codon/genetics , Genes, Viral , Genetic Drift , Host Adaptation/genetics , Humans , Immune Evasion , Phylogeny , Public Health
10.
Mol Genet Genomics ; 296(1): 113-118, 2021 Jan.
Article in English | MEDLINE | ID: covidwho-1384446

ABSTRACT

To better understand the interaction between SARS-CoV-2 and the human host and find potential ways to block the pandemic, one of the unresolved questions is how the virus economically utilizes the resources of the hosts. Particularly, the tRNA pool has been adapted to the host genes. If the virus intends to translate its own RNA, then it has to compete with the abundant host mRNAs for the tRNA molecules. Translation initiation is the rate-limiting step during protein synthesis. The tRNAs carrying the initiation Methionine (iMet) recognize the start codon termed initiation ATG (iATG). Other normal Met-carrying tRNAs recognize the internal ATGs. The tRNA adaptation index (tAI) of virus genes is significantly lower than the tAI of human genes. This disadvantage in translation elongation of viral RNAs must be compensated by more efficient initiation rates. In the human genome, the abundance of iMet-tRNAs to Met-tRNAs is five times higher than the iATG to ATG ratio. However, when SARS-CoV-2 infects human cells, the iMet has an 8.5-time enrichment to iATG. We collected 58 virus species and found that the enrichment of iMet is higher in all viruses than in humans. Our study indicates that the genome sequences of viruses like SARS-CoV-2 have the advantage of competing for the iMet-tRNAs with host mRNAs. The capture of iMet-tRNAs allows fast translation initiation and the reproduction of the virus itself, which compensates for the lower tAI of viral genes. This might explain why the virus could rapidly translate its own RNA and reproduce itself from the sea of host mRNAs. Meanwhile, our study reminds researchers not to ignore the mutations related to ATGs.


Subject(s)
Peptide Chain Initiation, Translational , RNA, Transfer, Met/metabolism , SARS-CoV-2/physiology , COVID-19/virology , Codon , Evolution, Molecular , Genome, Human , Host-Pathogen Interactions , Humans , Mutation , Protein Biosynthesis , SARS-CoV-2/genetics
11.
J Med Virol ; 93(9): 5630-5634, 2021 09.
Article in English | MEDLINE | ID: covidwho-1363678

ABSTRACT

Since the start of the coronavirus disease 2019 (COVID-19) pandemic, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has rapidly spread worldwide, becoming one of the major global public health issues of recent centuries. Currently, COVID-19 vaccine rollouts are finally upon us, carrying, as a new horizon, the hope of herd immunity once a sufficient proportion of the population has been vaccinated or infected. However, the emergence of SARS-CoV-2 variants has brought concerns since, as the virus is exposed to environmental selection pressures, it can mutate and evolve, generating variants that may possess enhanced virulence. Codon usage analysis is a strategy to elucidate the evolutionary pressure exerted on the viral genome by different hosts, as a possible cause of the emergence of new variants. Therefore, to get a better picture of the SARS-CoV-2 codon bias, we first identified the relative codon usage rate of all Betacoronavirus lineages. Subsequently, we correlated putative cognate transfer ribonucleic acids (tRNAs) to reveal how those viruses adapt to hosts in relation to their preferred codon usage. Our analysis revealed seven preferred codons, located in three different open reading frames, that appear to be preferentially used by SARS-CoV-2. In addition, the tRNA adaptation analysis indicates a wide strategy of competition between the virus and mammals as principal hosts, highlighting the importance of reinforcing genomic monitoring to promptly identify any potential adaptation of the virus to new potential hosts, which appears to be crucial to prevent and mitigate the pandemic.


Subject(s)
Betacoronavirus/genetics , Codon Usage , Coronavirus Infections/virology , Genome, Viral , Mammals , SARS-CoV-2/genetics , Animals , COVID-19 , COVID-19 Vaccines , Codon , Host-Pathogen Interactions , Humans , Mutation , Open Reading Frames , Phylogeny , RNA, Transfer
12.
Wiley Interdiscip Rev RNA ; 13(2): e1679, 2022 03.
Article in English | MEDLINE | ID: covidwho-1279257

ABSTRACT

If each of the four nucleotides were represented equally in the genomes of viruses and the hosts they infect, each base would occur at a frequency of 25%. However, this is not observed in nature. Similarly, the order of nucleotides is not random (e.g., in the human genome, guanine follows cytosine at a frequency of ~0.0125, or a quarter the number of times predicted by random representation). Codon usage and codon order are also nonrandom. Furthermore, nucleotide and codon biases vary between species. Such biases have various drivers, including cellular proteins that recognize specific patterns in nucleic acids, that once triggered, induce mutations or invoke intrinsic or innate immune responses. In this review we examine the types of compositional biases identified in viral genomes and current understanding of the evolutionary mechanisms underpinning these trends. Finally, we consider the potential for large scale synonymous recoding strategies to engineer RNA virus vaccines, including those with pandemic potential, such as influenza A virus and Severe Acute Respiratory Syndrome Coronavirus 2. This article is categorized under: RNA in Disease and Development > RNA in Disease; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Interactions with Proteins and Other Molecules > Protein-RNA Recognition.


Subject(s)
RNA Viruses , Viruses , Bias , Codon/genetics , Evolution, Molecular , Genome, Viral , Humans , Nucleotides , RNA Viruses/genetics , Viruses/genetics
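
The dinucleotide bias described in the abstract above is commonly summarized as an observed/expected odds ratio: the frequency of a dinucleotide divided by the product of the frequencies of its two bases. The sketch below computes that ratio. It is illustrative only: the function name and the toy sequence are hypothetical, and the sequence is assumed to contain only unambiguous bases.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, pair: str) -> float:
    """Observed/expected ratio f_xy / (f_x * f_y) for a dinucleotide such as "CG".

    Values below 1 mean the pair is under-represented relative to the
    sequence's own base composition; values above 1 mean over-represented.
    """
    seq = seq.upper()
    pair = pair.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == pair)
    f_xy = pair_count / (n - 1)
    f_x = base_counts[pair[0]] / n
    f_y = base_counts[pair[1]] / n
    if f_x == 0 or f_y == 0:
        raise ValueError("a base of the pair does not occur in the sequence")
    return f_xy / (f_x * f_y)


# Toy sequence with equal base frequencies (illustrative only).
toy_seq = "ACGTACGTACGT"
cg_ratio = dinucleotide_odds_ratio(toy_seq, "CG")
gc_ratio = dinucleotide_odds_ratio(toy_seq, "GC")
```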
13.
Biomed Res Int ; 2021: 9940010, 2021.
Article in English | MEDLINE | ID: covidwho-1259034

ABSTRACT

BACKGROUND: Respiratory syncytial virus (RSV) infection is a public health epidemic, leading to around 3 million hospitalizations and about 66,000 deaths each year. It is a life-threatening condition exclusive to children with no effective treatment. METHODS: In this study, we used system-level and vaccinomics approaches to design a polyvalent vaccine for RSV, which could stimulate the immune components of the host to manage this infection. Our framework involves data accession, antigenicity and subcellular localization analysis, T cell epitope prediction, proteasomal and conservancy evaluation, host-pathogen-protein interactions, pathway studies, and in silico binding affinity analysis. RESULTS: We found glycoprotein (G), fusion protein (F), and small hydrophobic protein (SH) of RSV as potential vaccine candidates. Of these proteins (G, F, and SH), we found 9 epitopes for multiple alleles of MHC classes I and II that bear significant binding affinity. These potential epitopes were linked to form a polyvalent construct using AAY, GPGPG linkers, and cholera toxin B adjuvant at the N-terminus, giving a 224-residue protein with a molecular weight of 23.9 kDa. The final construct was a stable, immunogenic, and nonallergenic protein containing cleavage sites, TAP transport efficiency, posttranslation shifts, and CTL epitopes. The molecular docking indicated the optimum binding affinity of the RSV polyvalent construct with MHC molecules (-12.49 and -10.48 kcal/mol for MHC classes I and II, respectively). This interaction showed that a polyvalent construct could manage and control this disease. CONCLUSION: Our vaccinomics and system-level investigation could be appropriate to trigger the host immune system to prevent RSV infection.


Subject(s)
Computational Biology/methods , Respiratory Syncytial Virus Infections/prevention & control , Respiratory Syncytial Virus, Human , Vaccines, Combined/therapeutic use , Alleles , Antigens , Codon , Computer Simulation , Epitopes , Epitopes, T-Lymphocyte , Glycoproteins/chemistry , Histocompatibility Antigens Class I , Histocompatibility Antigens Class II , Hospitalization , Humans , Immune System , Molecular Docking Simulation , Proteasome Endopeptidase Complex , Protein Interaction Mapping , Proteomics , T-Lymphocytes/immunology , Vaccines , Viral Fusion Proteins/chemistry
14.
Genomics ; 113(4): 2177-2188, 2021 07.
Article in English | MEDLINE | ID: covidwho-1233643

ABSTRACT

The prevailing COVID-19 pandemic has drawn the attention of the scientific community to study the evolutionary origin of Severe Acute Respiratory Syndrome Corona Virus 2 (SARS-CoV-2). This study is a comprehensive quantitative analysis of the protein-coding sequences of seven human coronaviruses (HCoVs) to decipher the nucleotide sequence variability and codon usage patterns. It is essential to understand the survival ability of the viruses, their adaptation to hosts, and their evolution. The current analysis revealed a high abundance of the relative dinucleotide (odds ratio), GC and CT pairs in the first and last two codon positions, respectively, as well as a low abundance of the CG pair in the last two positions of the codon, which might be related to the evolution of the viruses. A remarkable level of variability of GC content in the third position of the codon among the seven coronaviruses was observed. Codons with high RSCU values are primarily from the aliphatic and hydroxyl amino acid groups, and codons with low RSCU values belong to the aliphatic, cyclic, positively charged, and sulfur-containing amino acid groups. In order to elucidate the evolutionary processes of the seven coronaviruses, a phylogenetic tree (dendrogram) was constructed based on the RSCU scores of the codons. The CoVs in the severe and mild categories were positioned in different clades. A comparative phylogenetic study with other coronaviruses depicted that SARS-CoV-2 is close to the CoV isolated from pangolins (Manis javanica, Pangolin-CoV) and cats (Felis catus, SARS(r)-CoV). Further analysis of the effective number of codons (ENC) usage bias showed a relatively higher bias for SARS-CoV and MERS-CoV compared to SARS-CoV-2. The ENC plot against GC3 suggested that the mutational bias might have a role in determining the codon usage variation among candidate viruses. A codon adaptability study on a few human host parasites (from different kingdoms), including CoVs, showed a diverse adaptability pattern. SARS-CoV-2 and SARS-CoV exhibit relatively lower but similar codon adaptability compared to MERS-CoV.


Subject(s)
COVID-19/genetics , Codon Usage/genetics , Evolution, Molecular , SARS-CoV-2/genetics , Base Composition/genetics , COVID-19/virology , Codon/genetics , Computational Biology , Genome, Viral/genetics , Humans , Nucleotides/genetics , Pandemics , SARS-CoV-2/pathogenicity
15.
Nat Commun ; 12(1): 2642, 2021 05 11.
Article in English | MEDLINE | ID: covidwho-1225505

ABSTRACT

Despite its clinical importance, the SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. We use comparative genomics to provide a high-confidence protein-coding gene set, characterize evolutionary constraint, and prioritize functional mutations. We select 44 Sarbecovirus genomes at ideally-suited evolutionary distances, and quantify protein-coding evolutionary signatures and overlapping constraint. We find strong protein-coding signatures for ORFs 3a, 6, 7a, 7b, 8, 9b, and a novel alternate-frame gene, ORF3c, whereas ORFs 2b, 3d/3d-2, 3b, 9c, and 10 lack protein-coding signatures or convincing experimental evidence of protein-coding function. Furthermore, we show no other conserved protein-coding genes remain to be discovered. Mutation analysis suggests ORF8 contributes to within-individual fitness but not person-to-person transmission. Cross-strain and within-strain evolutionary pressures agree, except for fewer-than-expected within-strain mutations in nsp3 and S1, and more-than-expected in nucleocapsid, which shows a cluster of mutations in a predicted B-cell epitope, suggesting immune-avoidance selection. Evolutionary histories of residues disrupted by spike-protein substitutions D614G, N501Y, E484K, and K417N/T provide clues about their biology, and we catalog likely-functional co-inherited mutations. Previously reported RNA-modification sites show no enrichment for conservation. Here we report a high-confidence gene set and evolutionary-history annotations providing valuable resources and insights on SARS-CoV-2 biology, mutations, and evolution.


Subject(s)
COVID-19/virology , Genome, Viral/genetics , Mutation , SARS-CoV-2/genetics , Betacoronavirus/classification , Betacoronavirus/genetics , Codon , Evolution, Molecular , Genes, Viral , Genetic Fitness , Genetic Variation , Open Reading Frames , Phylogeny , Spike Glycoprotein, Coronavirus/genetics , Viral Proteins/genetics
16.
Sheng Wu Gong Cheng Xue Bao ; 37(4): 1334-1345, 2021 Apr 25.
Article in Chinese | MEDLINE | ID: covidwho-1209675

ABSTRACT

The main protease (Mpro) of SARS-CoV-2 is a highly conserved and mutation-resistant coronaviral enzyme, which plays a pivotal role in viral replication, making it an ideal target for the development of novel broad-spectrum anti-coronaviral drugs. In this study, a codon-optimized Mpro gene was cloned into pET-21a and pET-28a expression vectors. The recombinant plasmids were transformed into E. coli Rosetta(DE3) competent cells and the expression conditions were optimized. The highly expressed recombinant proteins, Mpro and Mpro-28, were purified by HisTrap™ chelating column and their proteolytic activity was determined by a fluorescence resonance energy transfer (FRET) assay. The FRET assay showed that Mpro exhibits a desirable proteolytic activity (25 000 U/mg), with Km and kcat values of 11.68 µmol/L and 0.037/s, respectively. The specific activity of Mpro is 25 times that of Mpro-28, a fusion protein carrying a polyhistidine tag at the N and C termini, indicating that additional residues at the N terminus of Mpro, but not at the C terminus, are detrimental to its proteolytic activity. The preparation of active SARS-CoV-2 Mpro through a codon-optimization strategy might facilitate the development of rapid screening assays for the discovery of broad-spectrum anti-coronaviral drugs targeting Mpro.


Subject(s)
COVID-19 , SARS-CoV-2 , Codon/genetics , Cysteine Endopeptidases/genetics , Escherichia coli/genetics , Humans , Peptide Hydrolases , Viral Nonstructural Proteins/genetics
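
The Km and kcat values reported in the abstract above can be combined into a catalytic efficiency (kcat/Km) and a per-enzyme turnover rate. The sketch below assumes that simple Michaelis-Menten kinetics apply, which the abstract does not state explicitly. The two constants come from the abstract; the function and variable names are hypothetical.

```python
# Kinetic constants as reported in the abstract.
KM_UMOL_PER_L = 11.68  # Michaelis constant, in micromol/L
KCAT_PER_S = 0.037     # turnover number, in 1/s


def turnover_rate(substrate_umol_per_l: float,
                  km_umol_per_l: float = KM_UMOL_PER_L,
                  kcat_per_s: float = KCAT_PER_S) -> float:
    """Per-enzyme rate v/[E] = kcat * [S] / (Km + [S]), assuming Michaelis-Menten kinetics."""
    return kcat_per_s * substrate_umol_per_l / (km_umol_per_l + substrate_umol_per_l)


# Catalytic efficiency kcat/Km, with Km converted from micromol/L to mol/L.
catalytic_efficiency = KCAT_PER_S / (KM_UMOL_PER_L * 1e-6)  # L/(mol*s), roughly 3.2e3

# At [S] = Km the rate is half of kcat by definition.
rate_at_km = turnover_rate(KM_UMOL_PER_L)
```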
17.
FEBS J ; 288(17): 5201-5223, 2021 09.
Article in English | MEDLINE | ID: covidwho-1146926

ABSTRACT

Circulating animal coronaviruses occasionally infect humans. SARS-CoV-2 is responsible for the current worldwide outbreak of COVID-19 that has resulted in 2 112 844 deaths as of late January 2021. We compared genetic code preferences in 496 viruses, including 34 coronaviruses and 242 corresponding hosts, to uncover patterns that distinguish single- and 'promiscuous' multiple-host-infecting viruses. Based on a codon usage preference score, promiscuous viruses were shown to significantly employ nonoptimal codons, namely codons that involve 'wobble' binding to anticodons, as compared to single-host viruses. The codon adaptation index (CAI) and the effective number of codons (ENC) were calculated for all viruses and hosts. Promiscuous viruses were less adapted to their hosts than single-host viruses (P-value = 4.392e-11). All coronaviruses exploit nonoptimal codons to infect multiple hosts. We found that nonoptimal codon preferences at the beginning of viral coding sequences enhance the translational efficiency of viral proteins within the host. Finally, coronaviruses lack endogenous RNA degradation motifs to a significant degree, thereby increasing viral mRNA burden and infection load. To conclude, we found that promiscuously infecting coronaviruses prefer nonoptimal codon usage to remove degradation motifs from their RNAs and to dramatically increase their viral RNA production rates.


Subject(s)
COVID-19/genetics , Codon Usage/genetics , Evolution, Molecular , SARS-CoV-2/genetics , Animals , COVID-19/virology , Codon/genetics , Computational Biology , Genetic Code/genetics , Genome, Viral/genetics , Humans , Phylogeny , RNA, Messenger/genetics , SARS-CoV-2/pathogenicity , Viral Proteins/genetics
18.
Viruses ; 13(3)2021 03 02.
Article in English | MEDLINE | ID: covidwho-1125910

ABSTRACT

Understanding SARS-CoV-2 evolution is a fundamental effort in coping with the COVID-19 pandemic. The virus genomes have been broadly evolving due to the high number of infected hosts worldwide. Mutagenesis and selection are two interdependent mechanisms of virus diversification. However, which mechanisms contribute to the mutation profiles of SARS-CoV-2 remains under-explored. Here, we delineate the contribution of mutagenesis and selection to the genome diversity of SARS-CoV-2 isolates. We generated a comprehensive phylogenetic tree with representative genomes. Instead of counting mutations relative to the reference genome, we identified each mutation event at the nodes of the phylogenetic tree. With this approach, we obtained the mutation events that are independent of each other and generated the mutation profile of SARS-CoV-2 genomes. The results suggest that the heterogeneous mutation patterns mainly reflect (i) host antiviral mechanisms that are achieved through APOBEC, ADAR, and ZAP proteins, and (ii) probable adaptation against reactive oxygen species.


Subject(s)
COVID-19/immunology , COVID-19/virology , Mutation , SARS-CoV-2/genetics , Base Sequence , COVID-19/genetics , Codon/genetics , Evolution, Molecular , Genome, Viral , Humans , Pandemics , Phylogeny , SARS-CoV-2/classification , SARS-CoV-2/immunology
19.
PLoS Biol ; 19(2): e3001091, 2021 02.
Article in English | MEDLINE | ID: covidwho-1102372

ABSTRACT

The recent emergence of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the underlying cause of Coronavirus Disease 2019 (COVID-19), has led to a worldwide pandemic causing substantial morbidity, mortality, and economic devastation. In response, many laboratories have redirected attention to SARS-CoV-2, meaning there is an urgent need for tools that can be used in laboratories unaccustomed to working with coronaviruses. Here we report a range of tools for SARS-CoV-2 research. First, we describe a facile single plasmid SARS-CoV-2 reverse genetics system that is simple to genetically manipulate and can be used to rescue infectious virus through transient transfection (without in vitro transcription or additional expression plasmids). The rescue system is accompanied by our panel of SARS-CoV-2 antibodies (against nearly every viral protein), SARS-CoV-2 clinical isolates, and SARS-CoV-2 permissive cell lines, which are all openly available to the scientific community. Using these tools, we demonstrate here that the controversial ORF10 protein is expressed in infected cells. Furthermore, we show that the promising repurposed antiviral activity of apilimod is dependent on TMPRSS2 expression. Altogether, our SARS-CoV-2 toolkit, which can be directly accessed via our website at https://mrcppu-covid.bio/, constitutes a resource with considerable potential to advance COVID-19 vaccine design, drug testing, and discovery science.


Subject(s)
COVID-19 Vaccines , COVID-19/diagnosis , COVID-19/virology , Reverse Genetics , SARS-CoV-2/genetics , A549 Cells , Angiotensin-Converting Enzyme 2/metabolism , Animals , Chlorocebus aethiops , Codon , Humans , Hydrazones/pharmacology , Mice , Morpholines/pharmacology , Open Reading Frames , Plasmids/genetics , Pyrimidines/pharmacology , Serine Endopeptidases/metabolism , Vero Cells , Viral Proteins/metabolism
20.
Sci Rep ; 11(1): 4108, 2021 02 18.
Article in English | MEDLINE | ID: covidwho-1091453

ABSTRACT

In December 2019, a rising number of pneumonia cases caused by a novel β-coronavirus (SARS-CoV-2) occurred in Wuhan, China; the virus has since spread rapidly worldwide, causing thousands of deaths. The WHO declared the SARS-CoV-2 outbreak a public health emergency of international concern, and since then several scientists have dedicated themselves to its study. It has been observed that many human viruses have codon usage biases that match highly expressed proteins in the tissues they infect and depend on the host cell machinery for replication and co-evolution. In this work, we analysed 91 molecular features and codon usage patterns for 339 viral genes and 463 human genes that consisted of 677,873 codon positions. We selected the highly expressed genes from human lung tissue to perform computational studies that allow their molecular features to be compared with those of SARS, SARS-CoV-2 and MERS genes. The integrated analysis of all the features revealed that certain viral genes and overexpressed human genes have similar codon usage patterns. The main pattern was the A/T bias, which together with other features could favour viral infection, enhanced by a host-dependent specialization of the translation machinery of only some of the overexpressed genes. The envelope protein E, the membrane glycoprotein M and ORF7 could benefit further. This could be the key to facilitated translation and viral replication, leading to different comorbidities depending on the genetic variability of the population due to the host translation machinery. This is the first codon usage approach that reveals which human genes could potentially be deregulated due to the codon usage similarities between the host and the viral genes when the virus is already inside the human cells of the lung tissues. Our work led to the identification of additional highly expressed human genes which are not the usual suspects but might play a role in the viral infection, and lays the basis for further research in the field of human genetics associated with new viral infections. Identifying the genes that could be deregulated under a viral infection is important for predicting the collateral effects and determining which individuals would be more susceptible based on their genetic features and associated comorbidities.


Subject(s)
Betacoronavirus/genetics , Coronavirus Infections/genetics , Coronavirus Infections/virology , Codon/genetics , Codon Usage , Computational Biology/methods , Coronavirus/genetics , Coronavirus Infections/metabolism , Genes, Viral , Genome, Viral , Humans , Middle East Respiratory Syndrome Coronavirus/genetics , Phylogeny , SARS Virus/genetics , SARS-CoV-2/genetics