1.
Hum Mol Genet ; 33(13): 1152-1163, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38558123

ABSTRACT

Neanderthal and Denisovan hybridisation with modern humans has generated a non-random genomic distribution of introgressed regions, the result of drift and selection dynamics. Cross-species genomic incompatibility and more efficient removal of slightly deleterious archaic variants have been proposed as selection-based processes involved in the post-hybridisation purge of archaic introgressed regions. Both scenarios require the presence of functionally different alleles across Homo species onto which selection operated differently according to which populations hosted them, but only a few of these variants have been pinpointed so far. In order to identify functionally divergent archaic variants removed in humans, we focused on mitonuclear genes, which are underrepresented in the genomic landscape of archaic humans. We searched for non-synonymous, fixed, archaic-derived variants present in mitonuclear genes, rare or absent in human populations. We then compared the functional impact of archaic and human variants in the model organism Saccharomyces cerevisiae. Notably, a variant within the mitochondrial tyrosyl-tRNA synthetase 2 (YARS2) gene exhibited a significant decrease in respiratory activity and a substantial reduction of Cox2 levels, a proxy for mitochondrial protein biosynthesis, coupled with the accumulation of the YARS2 protein precursor and a lower amount of mature enzyme. Our work suggests that this variant is associated with mitochondrial functionality impairment, thus contributing to the purging of archaic introgression in YARS2. While different molecular mechanisms may have impacted other mitonuclear genes, our approach can be extended to the functional screening of mitonuclear genetic variants present across species and populations.


Subject(s)
Neanderthals , Saccharomyces cerevisiae , Humans , Saccharomyces cerevisiae/genetics , Neanderthals/genetics , Animals , Genetic Variation , Mitochondria/genetics , Mitochondria/metabolism , Alleles , Genetic Introgression , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
2.
Microbiol Spectr ; 12(2): e0380223, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38230940

ABSTRACT

Despite being first identified more than three decades ago, the antisense gene asp of HIV-1 remains an enigma. asp is present uniquely in pandemic (group M) HIV-1 strains, and it is absent in all non-pandemic (out-of-M) HIV-1 strains and virtually all non-human primate lentiviruses. This suggests that the creation of asp may have contributed to HIV-1 fitness or worldwide spread. It also raises the question of which evolutionary processes were at play in the creation of asp. Here, we show that HIV-1 genomes containing an intact asp gene are associated with faster HIV-1 disease progression. Furthermore, we demonstrate that the creation of a full-length asp gene occurred via the evolution of codon usage in env overlapping asp on the opposite strand. This involved differential use of synonymous codons or conservative amino acid substitution in env that eliminated internal stop codons in asp, and redistribution of synonymous codons in env that minimized the likelihood of new premature stops arising in asp. Nevertheless, the creation of a full-length asp gene reduced the genetic diversity of env. The Luria-Delbrück fluctuation test suggests that the interrupted asp open reading frame (ORF) is the progenitor of the intact ORF, rather than a descendant under random genetic drift. Therefore, the existence of group-M isolates with a truncated asp ORF indicates an incomplete transition process. For the first time, our study links the presence of a full-length asp ORF to faster disease progression, thus warranting further investigation into the cellular processes and molecular mechanisms through which the ASP protein impacts HIV-1 replication, transmission, and pathogenesis.

IMPORTANCE

Overlapping genes engage in a tug-of-war, constraining each other's evolution. The creation of a new gene overlapping an existing one comes at an evolutionary cost. Thus, its conservation must be advantageous, or it will be lost, especially if the pre-existing gene is essential for the viability of the virus or cell. We found that the creation and conservation of the HIV-1 antisense gene asp occurred through differential use of synonymous codons or conservative amino acid substitutions within the overlapping gene, env. This process did not involve amino acid changes in ENV that benefited its function, but rather it constrained the evolution of ENV. Nonetheless, the creation of asp brought a net selective advantage to HIV-1 because asp is conserved especially among high-prevalence strains. The association between the presence of an intact asp gene and faster HIV-1 disease progression supports that conclusion and warrants further investigation.


Subject(s)
HIV-1 , Animals , HIV-1/genetics , Pandemics , Codon , Open Reading Frames , Disease Progression
3.
Viruses ; 15(7)2023 07 20.
Article in English | MEDLINE | ID: mdl-37515266

ABSTRACT

A common feature of the mammalian Lentiviruses (family Retroviridae) is an RNA genome that contains an extremely high frequency of adenine (31.7-38.2%) while being extremely poor in cytosine (13.9-21.2%). Such a biased nucleotide composition has implications for codon usage, causing a striking difference between the frequency of synonymous codons in Lentiviruses and that in their hosts. To test whether primate Lentiviruses present differences in codon and amino acid composition, we assembled a dataset of genome sequences that includes SIV species infecting Old-World monkeys and African apes, HIV-2, and the four groups of HIV-1. Using principal component analysis, we found that HIV-1 shows a significant enrichment in adenine plus thymine in the third synonymous codon position and in adenine and guanine in the first and second nonsynonymous codon positions. Similarly, we observed an enrichment in adenine and in guanine in nonsynonymous first and second codon positions, which affects the amino acid composition of the proteins Gag, Pol, Vif, Vpr, Tat, Rev, Env, and Nef. This result suggests an effect of natural selection in shaping codon usage. Under the hypothesis that the use of synonyms in HIV-1 could reflect adaptation to that of genes expressed in specific cell types, we found a highly significant correlation between codon usage in HIV-1 and monocytes, which was remarkably higher than that with B and T lymphocytes. This finding is in line with the notion that monocytes represent an HIV-1 reservoir in infected patients, and it could help understand how this reservoir is established and maintained.
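
The composition bias described in this abstract is measured per codon position. A minimal Python sketch of that bookkeeping follows, assuming an in-frame coding sequence as input. The function name and the toy sequence are illustrative and not taken from the paper, and the sketch does not separate synonymous from nonsynonymous sites as the study does.

```python
from collections import Counter


def composition_by_codon_position(cds: str) -> list[dict[str, float]]:
    """Return nucleotide frequencies at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (hypothetical input).
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    result = []
    for pos in range(3):
        # Every third nucleotide, starting at this codon position.
        counts = Counter(cds[pos:usable:3])
        total = sum(counts.values())
        result.append({nt: counts.get(nt, 0) / total for nt in "ACGT"})
    return result


# Toy sequence (not real viral data): ATG AAA GCA ACA TAA
toy_cds = "ATGAAAGCAACATAA"
freqs = composition_by_codon_position(toy_cds)
for position, table in enumerate(freqs, start=1):
    print(position, table)
```

Applied to real genomes, tables like these would be the input to a multivariate method such as the principal component analysis mentioned in the abstract.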


Subject(s)
HIV-1 , Lentiviruses, Primate , Animals , Amino Acids/genetics , Lentiviruses, Primate/genetics , Codon Usage , Codon , Lentivirus/genetics , HIV-1/genetics , Adenine , Guanine , Mammals
4.
Viruses ; 14(1)2022 01 14.
Article in English | MEDLINE | ID: mdl-35062351

ABSTRACT

Gene overprinting occurs when point mutations within a genomic region with an existing coding sequence create a new one in another reading frame. This process is quite frequent in viral genomes either to maximize the amount of information that they encode or in response to strong selective pressure. The most frequent scenario involves two different reading frames in the same DNA strand (sense overlap). Much less frequent are cases of overlapping genes that are encoded on opposite DNA strands (antisense overlap). One such example is the antisense ORF, asp, in the minus strand of the HIV-1 genome overlapping the env gene. The asp gene is highly conserved in pandemic HIV-1 strains of group M, and it is absent in non-pandemic HIV-1 groups, HIV-2, and lentiviruses infecting non-human primates, suggesting that the ~190-amino acid protein that is expressed from this gene (ASP) may play a role in virus spread. While the function of ASP in the virus life cycle remains to be elucidated, mounting evidence from several research groups indicates that ASP is expressed in vivo. There are two alternative hypotheses that could be envisioned to explain the origin of the asp ORF. On one hand, asp may have originally been present in the ancestor of contemporary lentiviruses, and subsequently lost in all descendants except for most HIV-1 strains of group M due to selective advantage. Alternatively, the asp ORF may have originated very recently with the emergence of group M HIV-1 strains from SIVcpz. Here, we used a combination of computational and statistical approaches to study the genomic region of env in primate lentiviruses to shed light on the origin, structure, and sequence evolution of the asp ORF. The results emerging from our studies support the hypothesis of a recent de novo addition of the antisense ORF to the HIV-1 genome through a process that entailed progressive removal of existing internal stop codons from SIV strains to HIV-1 strains of group M, and fine tuning of the codon sequence in env that reduced the chances of new stop codons occurring in asp. Altogether, the study supports the notion that the HIV-1 asp gene encodes an accessory protein, providing a selective advantage to the virus.
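
The key quantity in this abstract is the number of internal stop codons on the strand opposite a sense gene. The following Python sketch counts them under simplifying assumptions: the standard stop codons TAA, TAG and TGA, a plain DNA string as input, and a toy sequence rather than real env data. It illustrates the idea only and is not the authors' pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_stop_counts(sense_seq: str) -> list[int]:
    """Count stop codons in each of the three antisense reading frames."""
    antisense = reverse_complement(sense_seq)
    counts = []
    for frame in range(3):
        # Step through complete codons only, starting at this frame offset.
        codons = (antisense[i:i + 3] for i in range(frame, len(antisense) - 2, 3))
        counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts


# Toy sense-strand sequence (hypothetical, not HIV-1 env).
toy_sense = "ATGTTAGCCTCATAA"
print(antisense_stop_counts(toy_sense))
```

An antisense frame with a count of zero over a long stretch is a candidate open reading frame. Comparing counts across related strains would show whether stops were progressively removed, which is the pattern the abstract reports.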


Subject(s)
Genome, Viral , HIV-1/genetics , Human Immunodeficiency Virus Proteins/genetics , Open Reading Frames , Viral Envelope Proteins/genetics , Base Sequence , Codon , Evolution, Molecular , HIV Seropositivity/genetics , Humans
5.
Curr Opin Virol ; 52: 1-8, 2022 02.
Article in English | MEDLINE | ID: mdl-34798370

ABSTRACT

Viruses may evolve to increase the amount of encoded genetic information by means of overlapping genes, which utilize several reading frames. Such overlapping genes may be especially impactful for genomes of small size, often serving as a source of novel accessory proteins, some of which play a crucial role in viral pathogenicity or in promoting the systemic spread of virus. Diverse genome-based metrics were proposed to facilitate recognition of overlapping genes that otherwise may be overlooked during genome annotation. They can detect the atypical codon bias associated with the overlap (e.g. a statistically significant reduction in variability at synonymous sites) or other sequence-composition features peculiar to overlapping genes. In this review, I compare nine computational methods, discuss their strengths and limitations, and survey how they were applied to detect candidate overlapping genes in the genome of SARS-CoV-2, the etiological agent of the COVID-19 pandemic.


Subject(s)
COVID-19 , Pandemics , Computational Biology , Evolution, Molecular , Genes, Overlapping , Genome, Viral , Humans , Open Reading Frames , SARS-CoV-2
6.
Virology ; 562: 149-157, 2021 10.
Article in English | MEDLINE | ID: mdl-34339929

ABSTRACT

Six candidate overlapping genes have been detected in SARS-CoV-2, yet current methods struggle to detect overlapping genes that recently originated. However, such genes might encode proteins beneficial to the virus, and provide a model system to understand gene birth. To complement existing detection methods, I first demonstrated that selection pressure to avoid stop codons in alternative reading frames is a driving force in the origin and retention of overlapping genes. I then built a detection method, CodScr, based on this selection pressure. Finally, I combined CodScr with methods that detect other properties of overlapping genes, such as a biased nucleotide and amino acid composition. I detected two novel ORFs (ORF-Sh and ORF-Mh), overlapping the spike and membrane genes respectively, which are under selection pressure and may be beneficial to SARS-CoV-2. ORF-Sh and ORF-Mh are present, as ORFs uninterrupted by stop codons, in 100% and 95% of the SARS-CoV-2 genomes, respectively.


Subject(s)
Codon Usage , Genes, Overlapping , Open Reading Frames , SARS-CoV-2/genetics , Evolution, Molecular , Genome, Viral , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics , Statistics as Topic
7.
Genes (Basel) ; 12(6)2021 05 26.
Article in English | MEDLINE | ID: mdl-34073395

ABSTRACT

During their long evolutionary history viruses generated many proteins de novo by a mechanism called "overprinting". Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.


Subject(s)
Mutation Rate , Viruses/genetics , Genes, Viral , Phylogeny , Selection, Genetic , Viruses/classification
8.
Virology ; 558: 145-151, 2021 06.
Article in English | MEDLINE | ID: mdl-33774510

ABSTRACT

At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. We recommend calling the 39 codon Spike-overlapping ORF ORF2b; the 41, 57, and 22 codon ORF3a-overlapping ORFs ORF3c, ORF3d, and ORF3b; the 33 codon ORF3d isoform ORF3d-2; and the 97 and 73 codon Nucleocapsid-overlapping ORFs ORF9b and ORF9c. Finally, we document conflicting usage of the name ORF3b in 32 studies, and consequent erroneous inferences, stressing the importance of reserving identical names for homologs. We recommend that authors referring to these ORFs provide lengths and coordinates to minimize ambiguity caused by prior usage of alternative names.


Subject(s)
Open Reading Frames , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus , Terminology as Topic , SARS-CoV-2/immunology , Spike Glycoprotein, Coronavirus/classification , Spike Glycoprotein, Coronavirus/genetics
9.
Virology ; 546: 51-66, 2020 07.
Article in English | MEDLINE | ID: mdl-32452417

ABSTRACT

Overlapping genes originate by a mechanism of overprinting, in which nucleotide substitutions in a pre-existing frame induce the expression of a de novo protein from an alternative frame. In this study, I assembled a dataset of 319 viral overlapping genes, which included 82 overlaps whose expression is experimentally known and the respective 237 homologs. Principal component analysis revealed that overlapping genes have a common pattern of nucleotide and amino acid composition. Discriminant analysis separated overlapping from non-overlapping genes with an accuracy of 97%. When applied to overlapping genes with known genealogy, it separated ancestral from de novo frames with an accuracy close to 100%. This high discriminant power was crucial to computationally design variants of de novo viral proteins known to possess selective anticancer toxicity (apoptin) or protection against neurodegeneration (X protein), as well as to detect two new potential overlapping genes in the genome of the new coronavirus SARS-CoV-2.


Subject(s)
Betacoronavirus/genetics , Evolution, Molecular , Genes, Overlapping , Genes, Viral , Algorithms , Amino Acid Sequence , Base Sequence , Computational Biology , Computer Simulation , Discriminant Analysis , Least-Squares Analysis , Principal Component Analysis , SARS-CoV-2
10.
Virology ; 532: 39-47, 2019 06.
Article in English | MEDLINE | ID: mdl-31004987

ABSTRACT

Overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. Overlapping genes can undergo "symmetric evolution" (similar selection pressures on the two proteins) or "asymmetric evolution" (significantly different selection pressures on the two proteins). By sequence analysis of 75 pairs of homologous viral overlapping genes, I evaluated their accordance with one or the other model. Analysis of nucleotide and amino acid sequences revealed that half of overlaps undergo asymmetric evolution, as the protein from one frame shows a number of substitutions significantly higher than that of the protein from the other frame. Interestingly, the most variable protein (often known to interact with the host proteins) appeared to be encoded by the de novo frame in all cases examined. These findings suggest that overlapping genes, besides increasing the coding ability of viruses, are also a source of selective protein adaptation.


Subject(s)
Evolution, Molecular , Genes, Overlapping , Genes, Viral , Viral Proteins/genetics , Viruses/genetics , Amino Acid Sequence , Base Sequence , Genetic Variation , Models, Genetic , Open Reading Frames , Phylogeny , Selection, Genetic , Viruses/classification
11.
PLoS One ; 13(10): e0202513, 2018.
Article in English | MEDLINE | ID: mdl-30339683

ABSTRACT

Overlapping genes represent a fascinating evolutionary puzzle, since they encode two functionally unrelated proteins from the same DNA sequence. They originate by a mechanism of overprinting, in which point mutations in an existing frame allow the expression (the "birth") of a completely new protein from a second frame. In viruses, in which overlapping genes are abundant, these new proteins often play a critical role in infection, yet they are frequently overlooked during genome annotation. This results in erroneous interpretation of mutational studies and in a significant waste of resources. Therefore, overlapping genes need to be correctly detected, especially since they are now thought to be abundant also in eukaryotes. Developing better detection methods and conducting systematic evolutionary studies require a large, reliable benchmark dataset of known cases. We thus assembled a high-quality dataset of 80 viral overlapping genes whose expression is experimentally proven. Many of them were not present in databases. We found that overall, overlapping genes differ significantly from non-overlapping genes in their nucleotide and amino acid composition. In particular, the proteins they encode are enriched in high-degeneracy amino acids and depleted in low-degeneracy ones, which may alleviate the evolutionary constraints acting on overlapping genes. Principal component analysis revealed that the vast majority of overlapping genes follow a similar composition bias, despite their heterogeneity in length and function. Six proven mammalian overlapping genes also followed this bias. We propose that this apparently near-universal composition bias may either favour the birth of overlapping genes, or/and result from selection pressure acting on them.


Subject(s)
Evolution, Molecular , Genes, Overlapping/genetics , Proteins/genetics , Amino Acid Sequence/genetics , Animals , Genes, Viral/genetics , Mammals/genetics , Mutation , Open Reading Frames/genetics , Principal Component Analysis
12.
J Gen Virol ; 96(12): 3577-3586, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26446206

ABSTRACT

The polymerase (P) and surface (S) genes of hepatitis B virus (HBV) show the longest gene overlap in animal viruses. Gene overlaps originate by the overprinting of a novel frame onto an ancestral pre-existing frame. Identifying which frame is ancestral and which frame is de novo (the genealogy of the overlap) is an appealing topic. However, the P/S overlap of HBV is an intriguing paradox, because both genes are indispensable for virus survival. Thus, the hypothesis of a primordial virus without the surface protein or without the polymerase makes no biological sense. With the aim to determine the genealogy of the overlap, the codon usage of the overlapping frames P and S was compared to that of the non-overlapping region. It was found that the overlap of human HBV had two patterns of codon usage. One was localized in the 5′ one-third of the overlap and the other in the 3′ two-thirds. By extending the analysis to non-human HBVs, it was found that this feature occurred in all hepadnaviruses. Under the assumption that the ancestral frame has a codon usage significantly closer to that of the non-overlapping region than the de novo frame, the ancestral frames in the 5′ and 3′ regions of the overlap could be predicted. They were, respectively, frame S and frame P. These results suggest that the spacer domain of the polymerase and the S domain of the surface protein originated de novo by overprinting. They support a modular evolution hypothesis for the origin of the overlap.


Subject(s)
Evolution, Molecular , Gene Products, pol/metabolism , Hepatitis B Surface Antigens/metabolism , Hepatitis B virus/genetics , Hepatitis B virus/metabolism , Animals , Codon/physiology , Gene Products, pol/genetics , Genotype , Hepatitis B Surface Antigens/genetics , Hepatitis B virus/classification , Humans , Species Specificity
13.
J Theor Biol ; 357: 160-8, 2014 Sep 21.
Article in English | MEDLINE | ID: mdl-24853273

ABSTRACT

Little is known about the determinants of thermal stability in individual protein families. Most of the knowledge on thermostability comes, in fact, from comparative analyses between large, and heterogeneous, sets of thermo- and mesophilic proteins. Here, we present a multivariate statistical approach aimed at detecting signature sequences for thermostability in a single protein family. It was applied to the glutamate dehydrogenase (GDH) family, which is a good model for investigating this peculiar process. The structure of GDH consists of six subunits, each of them organized into two domains. Formation of ion-pair networks on the surface of the protein subunits, or increase in the inter-subunit hydrophobic interactions, have been suggested as important factors for explaining stability at high temperatures. However, identification of the amino acid changes that are involved in this process still remains elusive. Our approach consisted of a linear discriminant analysis on a set of GDH sequences from Archaea and Bacteria (33 thermo- and 36 mesophilic GDHs). It led to detection of 3 amino acid clusters as the putative determinants of thermal stability. They were localized at the subunit interface or in close proximity to the binding site of the NAD(P)+ coenzyme. Analysis within the clusters led to prediction of 8 critical amino acid sites. This approach could have a wide utility, in the light of the notion that each protein family seems to adopt its own strategy for achieving thermostability.


Subject(s)
Archaeal Proteins/chemistry , Bacterial Proteins/chemistry , Glutamate Dehydrogenase/chemistry , Hot Temperature , Models, Chemical , Binding Sites , Enzyme Stability , Protein Subunits/chemistry
14.
PLoS Comput Biol ; 9(8): e1003162, 2013.
Article in English | MEDLINE | ID: mdl-23966842

ABSTRACT

A well-known mechanism through which new protein-coding genes originate is by modification of pre-existing genes, e.g. by duplication or horizontal transfer. In contrast, many viruses generate protein-coding genes de novo, via the overprinting of a new reading frame onto an existing ("ancestral") frame. This mechanism is thought to play an important role in viral pathogenicity, but has been poorly explored, perhaps because identifying the de novo frames is very challenging. Therefore, a new approach to detect them was needed. We assembled a reference set of overlapping genes for which we could reliably determine the ancestral frames, and found that their codon usage was significantly closer to that of the rest of the viral genome than the codon usage of de novo frames. Based on this observation, we designed a method that allowed the identification of de novo frames based on their codon usage with a very good specificity, but intermediate sensitivity. Using our method, we predicted that the Rex gene of deltaretroviruses has originated de novo by overprinting the Tax gene. Intriguingly, several genes in the same genomic region have also originated de novo and encode proteins that regulate the functions of Tax. Such "gene nurseries" may be common in viral genomes. Finally, our results confirm that the genomic GC content is not the only determinant of codon usage in viruses and suggest that a constraint linked to translation must influence codon usage.
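
The criterion in this abstract is that the ancestral frame has codon usage closer to the rest of the genome than the de novo frame does. A minimal Python sketch of such a comparison follows. The distance used here (half the summed absolute difference of codon frequencies) is an assumption chosen for simplicity, because the abstract does not state the statistic used. The function names and sequences are hypothetical.

```python
from collections import Counter
from itertools import product

CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
_CODON_SET = set(CODONS)


def codon_frequencies(seq: str, frame: int = 0) -> dict[str, float]:
    """Relative frequency of each of the 64 codons in the given frame."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Skip codons containing ambiguous characters such as N.
    counts = Counter(c for c in codons if c in _CODON_SET)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete unambiguous codon found")
    return {c: counts[c] / total for c in CODONS}


def usage_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Half the L1 distance between two codon-frequency tables (0 to 1)."""
    return 0.5 * sum(abs(a[c] - b[c]) for c in CODONS)


# Toy sequences (hypothetical): a non-overlapping reference and two frames.
reference = codon_frequencies("GCTGCTGCTAAA")
frame_a = codon_frequencies("GCTGCTAAAGCT")
frame_b = codon_frequencies("GCCGCCGCCAAG")

distances = {
    "frame_a": usage_distance(reference, frame_a),
    "frame_b": usage_distance(reference, frame_b),
}
print(distances, "closest to reference:", min(distances, key=distances.get))
```

In this toy case the frame closest to the reference would be the candidate ancestral frame. A real analysis would also need a significance test, which the sketch omits.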


Subject(s)
Codon , Evolution, Molecular , Genome, Viral/genetics , Human T-lymphotropic virus 1/genetics , Viral Proteins/genetics , Genomics , Models, Genetic
15.
Microbiology (Reading) ; 156(Pt 11): 3243-3254, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20634238

ABSTRACT

Whole-genome sequencing efforts have revolutionized the study of bifidobacterial genetics and physiology. Unfortunately, the sequence of a single genome does not provide information on bifidobacterial genetic diversity and on how genetic variability supports improved adaptation of these bacteria to the environment of the human gastrointestinal tract (GIT). Analysis of nine genomes from bifidobacterial species showed that such genomes display an open pan-genome structure. Mathematical extrapolation of the data indicates that the genome reservoir available to the bifidobacterial pan-genome consists of more than 5000 genes, many of which are uncharacterized, but which are probably important to provide adaptive abilities pertinent to the human GIT. We also define a core bifidobacterial gene set which will undoubtedly provide a new baseline from which one can examine the evolution of bifidobacteria. Phylogenetic investigation performed on a total of 506 orthologues that are common to nine complete bifidobacterial genomes allowed the construction of a Bifidobacterium supertree which is largely concordant with the phylogenetic tree obtained using 16S rRNA genes. Moreover, this supertree provided a more robust phylogenetic resolution than the 16S rRNA gene-based analysis. This comparative study of the genus Bifidobacterium thus presents a foundation for future functional analyses of this important group of GIT bacteria.
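
The core gene set and the pan-genome mentioned in this abstract can be expressed as set operations over per-genome gene-family inventories. The Python sketch below assumes that orthologous gene families have already been assigned. The strain names and family labels are made up for illustration and do not come from the study.

```python
def core_and_pan_genome(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (core, pan): families shared by all genomes, and all families seen."""
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)  # present in every genome
    pan = set.union(*gene_sets)          # present in at least one genome
    return core, pan


# Hypothetical gene-family inventories for three strains.
toy_genomes = {
    "strain_A": {"dnaA", "gyrB", "lacZ"},
    "strain_B": {"dnaA", "gyrB", "xylA"},
    "strain_C": {"dnaA", "gyrB", "lacZ", "araB"},
}
core, pan = core_and_pan_genome(toy_genomes)
print(sorted(core), len(pan))
```

A pan-genome is described as open when the union keeps growing as further genomes are added, which is the behaviour the abstract reports for bifidobacteria.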


Subject(s)
Bifidobacterium/genetics , Comparative Genomic Hybridization , Genome, Bacterial , Genomics , DNA, Bacterial/genetics , Gastrointestinal Tract/microbiology , Humans , Multivariate Analysis , Phylogeny , Proteome , RNA, Ribosomal, 16S/genetics , Sequence Alignment , Sequence Analysis, DNA
16.
Gene ; 402(1-2): 28-34, 2007 Nov 01.
Article in English | MEDLINE | ID: mdl-17825505

ABSTRACT

In viruses under strong pressure to minimize genome size, overlapping genes represent a fine strategy to condense a maximum amount of information into short nucleotide sequences. Here, we investigated the evolution of the genes encoding the nonstructural proteins NS1 and NS2 of influenza A virus (IAV), which are one of the best characterized cases of gene overlap. By a detailed analysis of about four hundred sequences grouped into 11 IAV subtypes, we found that the overlapping coding region of the NS1 gene shows a significant increase of the rate of nonsynonymous change, with respect to its nonoverlapping counterpart. The same feature was observed in the overlapping coding region of the NS2 gene. Such a variation pattern, which implies the occurrence of several amino acid substitutions in the protein regions encoded by overlapping frames, is different from the pattern of constrained evolution typical of other viral overlapping-gene systems. Amino acid sequence analysis of the NS1 and NS2 proteins revealed that some nonsynonymous substitutions, located in the region of gene overlap, play a critical role in shaping the genetic diversity of the highly pathogenic subtype H5N1. Since both proteins contribute to disease pathogenesis by affecting many virus and host-cell processes, information provided by this study should be useful to highlight the impact of nonstructural gene variation on the pathogenicity of H5N1 viruses.


Subject(s)
Amino Acid Substitution/genetics , Genes, Overlapping , Genetic Variation , Influenza A Virus, H5N1 Subtype/genetics , Viral Nonstructural Proteins/genetics , Evolution, Molecular , Open Reading Frames , Sequence Analysis, Protein , Viral Nonstructural Proteins/metabolism
17.
J Gen Virol ; 87(Pt 4): 1013-1017, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16528052

ABSTRACT

The possibility of creating novel genes from pre-existing sequences, known as overprinting, is a widespread phenomenon in small viruses. Here, the origin and evolution of gene overlap in the bacteriophages belonging to the family Microviridae have been investigated. The distinction between ancestral and derived frames was carried out by comparing the patterns of codon usage in overlapping and non-overlapping genes. By this approach, a gradual increase in complexity of the phage genome--from an ancestral state lacking gene overlap to a derived state with a high density of genetic information--was inferred. Genes encoding less-essential proteins, yet playing a role in phage growth and diffusion, were predicted to be novel genes that originated by overprinting. Evaluation of the rates of synonymous and non-synonymous substitution yielded evidence for overlapping genes under positive selection in one frame and purifying selection in the alternative frame.


Subject(s)
Evolution, Molecular , Genes, Overlapping/genetics , Genes, Viral , Microviridae/genetics , Coliphages/genetics , Escherichia coli/virology , Selection, Genetic
18.
J Gen Virol ; 86(Pt 5): 1315-1326, 2005 May.
Article in English | MEDLINE | ID: mdl-15831942

ABSTRACT

JC virus (JCV) is a double-stranded DNA polyomavirus co-evolving with humans since the time of their origin in Africa. JCV seems to provide new insights into the history of human populations, as it suggests an expansion of humans from Africa via two distinct migrations, each carrying a different lineage of the virus. A possible alternative to this interpretation could be that the divergence between the two lineages is due to selective pressures favouring adaptation of JCV to different climates, thus making any inference about human history debatable. In the present study, the evolution of JCV was investigated by applying correspondence analysis to a set of 273 fully sequenced strains. The first and more important axis of ordination led to the detection of 61 nt positions as the main determinants of the divergence between the two virus lineages. One lineage includes strains of types 1 and 4, the other strains of types 2, 3, 7 and 8. The distinctiveness of the Caucasian lineage (types 1 and 4), largely diffused in the northern areas of the world, was almost entirely ascribed to synonymous substitutions. The findings provided by the subsequent axes of ordination supported the view of an evolutionary history of JCV characterized by genetic drift and migration, rather than by natural selection. Correspondence analysis was also applied to a set of 156 human mitochondrial genome sequences. A detailed comparison between the substitution patterns in JCV and mitochondria brought to light some relevant advantages of the use of the virus in tracing human migrations.


Subject(s)
DNA, Mitochondrial/genetics , DNA, Viral/genetics , Emigration and Immigration/history , Genetic Drift , JC Virus/genetics , Biological Evolution , History, Ancient , Humans , JC Virus/isolation & purification , Mitochondria/genetics , Polyomavirus Infections/virology , Selection, Genetic , Sequence Homology, Nucleic Acid
19.
J Mol Evol ; 58(3): 304-13, 2004 Mar.
Article in English | MEDLINE | ID: mdl-15045485

ABSTRACT

The polyomavirus JC (JCV) is a double-stranded DNA virus that is ubiquitous in human populations and is excreted in urine by a large percentage of individuals (20-70%). The strong genetic stability, combined with a mechanism of transmission mainly within the family, makes JCV a good marker of human migrations. In this study, the coevolution of JCV with its human host is investigated by using over a thousand nucleotide sequences deposited in the EMBL database; they correspond to the IG region, which is the genomic region with the highest rate of variation. The pattern of genetic diversity in JCV is evaluated by the principal coordinates analysis and the construction of synthetic maps. The first principal coordinate supports the existence of two distinct virus lineages, both arising from the ancestral African type. The first synthetic map suggests a two-migration model of the human dispersal out of Africa, thus implying a more complex picture than that known from human genes. The second principal coordinate points out the distinctiveness of strains coming from Asian/Amerind populations. The picture yielded by the second synthetic map appears to be more consistent with that known from human genes. In fact, it provides evidence of a deep split of the Asian lineage of JCV into two main branches: one diffusing in Japan and Americas, the other in Southeast Asia. The view that JCV, with its peculiar feature of a dual early emergence from Africa, can provide new information about the evolutionary history of our ancestors is discussed.
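
As a rough illustration of principal coordinates analysis, the sketch below double-centres a distance matrix and extracts the first principal coordinate by power iteration. It assumes a symmetric matrix of pairwise distances is already available; how the distances between JCV sequences were computed is not stated in the abstract, and the function names and toy matrix are hypothetical.

```python
import math


def double_center(dist):
    """Gower centring: B = -1/2 * J * D^2 * J for a symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    grand = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
            for i in range(n)]


def first_principal_coordinate(dist, iterations=500):
    """Leading PCoA axis by power iteration.

    Assumes the largest-magnitude eigenvalue of the centred matrix is positive
    and that the start vector is not orthogonal to its eigenvector.
    """
    b = double_center(dist)
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n
        v = [x / norm for x in w]
        eigenvalue = norm  # converges to the dominant eigenvalue
    # Scale the unit eigenvector by the square root of its eigenvalue.
    return [math.sqrt(eigenvalue) * x for x in v]


# Toy distances between three items lying on a line at positions 0, 1 and 3.
toy_dist = [[0, 1, 3],
            [1, 0, 2],
            [3, 2, 0]]
coords = first_principal_coordinate(toy_dist)
```

For distances that are exactly Euclidean, as in the toy matrix, the first coordinate reproduces the original spacing up to sign and a shift of origin. Further axes would need deflation or a full eigendecomposition.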


Subject(s)
Demography , Emigration and Immigration/history , Evolution, Molecular , Genetic Variation , JC Virus/genetics , Databases, Nucleic Acid , Geographic Information Systems , Geography , History, Ancient , Humans , Population Dynamics , Principal Component Analysis , Sequence Alignment
20.
J Mol Evol ; 56(5): 564-72, 2003 May.
Article in English | MEDLINE | ID: mdl-12698293

ABSTRACT

The presence of distinctive types of JC virus (JCV) in the main ethnic groups suggests a close coevolution with the human host. However, phylogenetic trees of JCV show a basal clade of European lineages (Types 1/4), whereas trees of human genes are coherent in placing the first split between African and non-African populations. This discrepancy places into question the effectiveness of JCV as a marker of human population history. The present study investigates the evolution of JCV using a large set of fully sequenced strains. Their relationships are first elucidated by principal coordinates analysis. It is suggested that Type 6 from West Africa could represent the ancestral type, while the peculiar phylogeny of Types 1/4 could reflect their direct origin from the ancestral lineage. Further credit to the African origin of JCV is provided by a neighbor-joining analysis based on slow-evolving sites. Sequence analysis of fast-evolving sites reveals that the deep emergence of Types 1/4 in the tree does not reflect a real evolutionary divergence; rather it is the implicit result of a remarkably different G + C content. The hypothesis that Types 1/4 originated directly from Type 6 is confirmed by examining the pattern of variation at a few specific fast-evolving sites. On the basis of this approach, a twofold exit of JCV from Africa is hypothesized: one in the direction of the Eurasian continent and another limited to Europe. These findings suggest that two distinct migrations of individuals played a key role in the peopling of Europe during prehistoric times.
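
The G + C content referred to in this abstract is simply the share of guanine and cytosine among the bases of a sequence. A minimal sketch follows. The function name and toy sequences are hypothetical, and ignoring ambiguous bases such as N is a choice made here rather than one taken from the paper.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Return 0.0 for empty or fully ambiguous input instead of dividing by zero.
    return gc / total if total else 0.0


# Toy sequences (not real JCV data) with different base composition.
balanced = gc_content("GGCCAATT")
at_rich = gc_content("ATATATGC")
```

Comparing such values across strains, restricted to the sites of interest, is one simple way to check whether a lineage differs markedly in base composition.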


Subject(s)
Evolution, Molecular , JC Virus/genetics , Population Dynamics , Africa , Animals , Archaeology , Base Sequence , Genetics, Population , Genotype , Humans , Molecular Sequence Data , Phylogeny , Sequence Alignment
...