Results 1 - 20 of 40
1.
BMC Bioinformatics ; 25(1): 151, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627634

ABSTRACT

BACKGROUND: Genomes are inherently inhomogeneous, with features such as base composition, recombination, gene density, and gene expression varying along chromosomes. Evolutionary, biological, and biomedical analyses aim to quantify this variation, account for it during inference procedures, and ultimately determine the causal processes behind it. Since sequential observations along chromosomes are not independent, it is unsurprising that autocorrelation patterns have been observed, e.g., in human base composition. In this article, we develop a class of Hidden Markov Models (HMMs) called oHMMed (ordered HMM with emission densities; the corresponding R package of the same name is available on CRAN). These models identify the number of comparably homogeneous regions within autocorrelated observed sequences. The regions are modelled as discrete hidden states; the observed data points are realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting only neighbouring states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are inferred. RESULTS: We apply our oHMMed algorithms to the proportion of G and C bases (modelled as a mixture of normal distributions) and the number of genes (modelled as a mixture of Poisson-gamma distributions) in windows along the human, mouse, and fruit fly genomes. This results in a partitioning of the genomes into regions by statistically distinguishable averages of these features, and in a characterisation of their continuous patterns of variation. In regard to the genomic G and C proportion, this latter result distinguishes oHMMed from segmentation algorithms based on isochore or compositional domain theory. We further use oHMMed to conduct a detailed analysis of variation of chromatin accessibility (ATAC-seq) and epigenetic markers H3K27ac and H3K27me3 (modelled as a mixture of Poisson-gamma distributions) along the human chromosome 1 and their correlations. CONCLUSIONS: Our algorithms provide a biologically assumption-free approach to characterising genomic landscapes shaped by continuous, autocorrelated patterns of variation. Despite this, the resulting genome segmentation enables extraction of compositionally distinct regions for further downstream analyses.
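
The ordering constraint described in this abstract can be illustrated with a small Python sketch. It is not the oHMMed R package or its inference procedure: the function names, the single move probability `p_move`, the shared standard deviation, and the uniform initial distribution are all assumptions made for illustration, and the sketch only decodes a state path for known parameters rather than inferring them. It shows a transition matrix in which a hidden state can only stay put or move to a neighbour in the ordering of the emission means, followed by Viterbi decoding with normal emissions.

```python
import math

def ordered_transition_matrix(n_states, p_move):
    """Tridiagonal matrix: a state may stay put or move to an adjacent
    state in the ordering of the emission means (p_move per neighbour)."""
    T = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_states]
        for j in neighbours:
            T[i][j] = p_move
        T[i][i] = 1.0 - p_move * len(neighbours)
    return T

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def viterbi(obs, means, sd, T):
    """Most probable hidden-state path for normal emissions with shared sd."""
    n = len(means)
    # Uniform initial distribution: an assumption of this sketch.
    log_delta = [math.log(1.0 / n) + math.log(normal_pdf(obs[0], m, sd))
                 for m in means]
    back = []
    for x in obs[1:]:
        new, ptr = [], []
        for j in range(n):
            # Only predecessors with non-zero transition probability count,
            # which is what enforces the neighbour-only constraint.
            cands = [(log_delta[i] + math.log(T[i][j]), i)
                     for i in range(n) if T[i][j] > 0.0]
            best, arg = max(cands)
            new.append(best + math.log(normal_pdf(x, means[j], sd)))
            ptr.append(arg)
        log_delta = new
        back.append(ptr)
    state = max(range(n), key=lambda j: log_delta[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy GC proportions in consecutive windows (made-up values).
T = ordered_transition_matrix(3, 0.1)
gc = [0.30, 0.31, 0.29, 0.40, 0.41, 0.50, 0.52]
print(viterbi(gc, means=[0.3, 0.4, 0.5], sd=0.02, T=T))
```

Because the corner entries of the matrix are zero, a decoded path can never jump from the lowest-mean state directly to the highest-mean state; it has to pass through the intermediate one.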


Subject(s)
Genome , Genomics , Animals , Humans , Mice , Markov Chains , Base Composition , Probability , Algorithms
2.
Mol Biol Evol ; 41(5)2024 May 03.
Article in English | MEDLINE | ID: mdl-38667829

ABSTRACT

Different frequencies amongst codons that encode the same amino acid (i.e. synonymous codons) have been observed in multiple species. Studies focused on uncovering the forces that drive such codon usage showed that a combined effect of mutational biases and translational selection works to produce different frequencies of synonymous codons. However, only a few studies have been able to measure and distinguish between these forces, which may leave similar traces on the coding regions. Here, we have developed a codon model that allows the disentangling of mutation, selection on amino acids and synonymous codons, and GC-biased gene conversion (gBGC), which we applied to an extensive dataset of 415 chordates and 191 arthropods. We found that chordates need 15 more synonymous codon categories than arthropods to explain the empirical codon frequencies, which suggests that the extent of codon usage can vary greatly between animal phyla. Moreover, methylation at CpG sites seems to partially explain these patterns of codon usage in chordates but not in arthropods. Despite the differences between the two phyla, our findings demonstrate that in both, GC-rich codons are disfavored when mutations are GC-biased, and the opposite is true when mutations are AT-biased. This indicates that selection on the genomic coding regions might act primarily to stabilize their GC/AT content on a genome-wide level. Our study shows that the degree of synonymous codon usage varies considerably among animals, but is likely governed by a common underlying dynamic.


Subject(s)
Arthropods , Codon Usage , Selection, Genetic , Animals , Arthropods/genetics , Chordata/genetics , Mutation , Evolution, Molecular , Codon , Models, Genetic , Base Composition , Gene Conversion
3.
Genome Biol Evol ; 15(7)2023 07 03.
Article in English | MEDLINE | ID: mdl-37341535

ABSTRACT

Experimental evolution studies are powerful approaches to examine the evolutionary history of lab populations. Such studies have shed light on how selection changes phenotypes and genotypes. Most of these studies have not examined the time course of adaptation under sexual selection manipulation by resequencing the populations' genomes at multiple time points. Here, we analyze allele frequency trajectories in Drosophila pseudoobscura where we altered their sexual selection regime for 200 generations and sequenced pooled populations at 5 time points. The intensity of sexual selection was either relaxed in monogamous populations (M) or elevated in polyandrous lines (E). We present a comprehensive study of how selection alters population genetics parameters at the chromosome and gene level. We investigate differences in the effective population size (Ne) between the treatments, and perform a genome-wide scan to identify signatures of selection from the time-series data. We found genomic signatures of adaptation to both regimes in D. pseudoobscura. There are more significant variants in E lines as expected from stronger sexual selection. However, we found that the response on the X chromosome was substantial in both treatments, more pronounced in E and restricted to the more recently sex-linked chromosome arm XR in M. In the first generations of experimental evolution, we estimate Ne to be lower on the X in E lines, which might indicate a swift adaptive response at the onset of selection. Additionally, the third chromosome was affected by elevated polyandry whereby its distal end harbors a region showing a strong signal of adaptive evolution especially in E lines.


Subject(s)
Drosophila , Sexual Selection , Animals , Drosophila/genetics , Gene Frequency , Genetics, Population , Adaptation, Physiological/genetics , Selection, Genetic , Biological Evolution
4.
J Evol Biol ; 36(1): 29-44, 2023 01.
Article in English | MEDLINE | ID: mdl-36544394

ABSTRACT

For over a decade, experimental evolution has been combined with high-throughput sequencing techniques. In so-called Evolve-and-Resequence (E&R) experiments, populations are kept in the laboratory under controlled experimental conditions where their genomes are sampled and allele frequencies monitored. However, identifying signatures of adaptation in E&R datasets is far from trivial, and it is still necessary to develop more efficient and statistically sound methods for detecting selection in genome-wide data. Here, we present Bait-ER, a fully Bayesian approach based on the Moran model of allele evolution to estimate selection coefficients from E&R experiments. The model has overlapping generations, a feature that describes several experimental designs found in the literature. We tested our method under several different demographic and experimental conditions to assess its accuracy and precision, and it performs well in most scenarios. Nevertheless, some care must be taken when analysing trajectories where drift largely dominates and starting frequencies are low. We compare our method with other available software and report that ours has generally high accuracy even for trajectories whose complexity goes beyond a classical sweep model. Furthermore, our approach avoids the computational burden of simulating an empirical null distribution, outperforming available software in terms of computational time and facilitating its use on genome-wide data. We implemented and released our method in a new open-source software package that can be accessed at https://doi.org/10.5281/zenodo.7351736.
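
To make the Moran model with selection more concrete, the following Python sketch simulates a single allele-count trajectory in a haploid population and evaluates the textbook fixation probability for the same process. It is not the Bait-ER implementation and performs no Bayesian inference; the function names and the parametrisation (relative fitness 1 + s, birth proportional to fitness, death uniform, no mutation) are assumptions chosen for illustration.

```python
import random

def moran_step_probs(i, n, s):
    """Probabilities that the count of the selected allele goes up or down
    by one in a single birth-death event (i copies among n individuals)."""
    total_fitness = (1.0 + s) * i + (n - i)
    p_up = ((1.0 + s) * i / total_fitness) * ((n - i) / n)
    p_down = ((n - i) / total_fitness) * (i / n)
    return p_up, p_down

def simulate_trajectory(i0, n, s, n_steps, rng):
    """Allele counts after each of n_steps Moran events, starting at i0."""
    traj = [i0]
    i = i0
    for _ in range(n_steps):
        p_up, p_down = moran_step_probs(i, n, s)
        u = rng.random()
        if u < p_up:
            i += 1
        elif u < p_up + p_down:
            i -= 1
        traj.append(i)
    return traj

def fixation_probability(i, n, s):
    """Probability that i copies eventually reach fixation (no mutation)."""
    if abs(s) < 1e-12:
        return i / n  # neutral limit
    r = 1.0 + s
    return (1.0 - r ** (-i)) / (1.0 - r ** (-n))

rng = random.Random(42)
trajectory = simulate_trajectory(i0=10, n=100, s=0.05, n_steps=5000, rng=rng)
print(trajectory[::1000])
print(fixation_probability(10, 100, 0.05))
```

In this parametrisation the ratio of the up and down probabilities is always 1 + s, which is why low starting frequencies combined with small s give trajectories that are hard to distinguish from drift.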


Subject(s)
Selection, Genetic , Software , Bayes Theorem , Gene Frequency , Adaptation, Physiological
5.
Genome Biol Evol ; 14(1)2022 01 04.
Article in English | MEDLINE | ID: mdl-34983052

ABSTRACT

Despite the importance of natural selection in species' evolutionary history, phylogenetic methods that take into account population-level processes typically ignore selection. The assumption of neutrality is often based on the idea that selection occurs at a minority of loci in the genome and is unlikely to compromise phylogenetic inferences significantly. However, genome-wide processes like GC-bias and some variation segregating at the coding regions are known to evolve in the nearly neutral range. As we are now using genome-wide data to estimate species trees, it is natural to ask whether weak but pervasive selection is likely to blur species tree inferences. We developed a polymorphism-aware phylogenetic model tailored for measuring signatures of nucleotide usage biases to test the impact of selection in the species tree. Our analyses indicate that although the inferred relationships among species are not significantly compromised, the genetic distances are systematically underestimated in a node-height-dependent manner: that is, the deeper nodes tend to be more underestimated than the shallow ones. Such biases have implications for molecular dating. We dated the evolutionary history of 30 worldwide fruit fly populations, and we found signatures of GC-bias considerably affecting the estimated divergence times (up to 23%) in the neutral model. Our findings call for the need to account for selection when quantifying divergence or dating species evolution.


Subject(s)
Codon Usage , Evolution, Molecular , Animals , Codon Usage/genetics , Drosophila , Nucleotides , Phylogeny , Selection, Genetic
6.
J Theor Biol ; 486: 110074, 2020 02 07.
Article in English | MEDLINE | ID: mdl-31711991

ABSTRACT

Polymorphism-aware phylogenetic models (PoMo) constitute an alternative approach for species tree estimation from genome-wide data. PoMo builds on the standard substitution models of DNA evolution but expands the classic alphabet of the four nucleotide bases to include polymorphic states. By doing so, PoMo accounts for ancestral and current intra-population variation, while also accommodating population-level processes ruling the substitution process (e.g. genetic drift, mutations, allelic selection). PoMo has been shown to be a valuable tool in several phylogenetic applications, but a proof of statistical consistency (and identifiability, a necessary condition for consistency) is lacking. Here, we prove that PoMo is identifiable and, using this result, we further show that the maximum a posteriori (MAP) tree estimator of PoMo is a consistent estimator of the species tree. We complement our theoretical results with a simulated data set mimicking the diversity observed in natural populations exhibiting incomplete lineage sorting. We implemented PoMo in a Bayesian framework and show that the MAP tree easily recovers the true tree for typical numbers of sites that are sampled in genome-wide analyses.
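
The expanded alphabet can be enumerated in a few lines of Python. This is a sketch of the state space only, assuming biallelic polymorphic states and a fixed population size parameter `n`; the function name and the tuple representation are hypothetical and do not come from any PoMo implementation.

```python
from itertools import combinations

NUCLEOTIDES = "ACGT"

def pomo_states(n):
    """Fixed states (all n individuals carry one base) plus polymorphic
    states (i copies of base a and n - i copies of base b, 0 < i < n)."""
    fixed = [((a, n),) for a in NUCLEOTIDES]
    polymorphic = [((a, i), (b, n - i))
                   for a, b in combinations(NUCLEOTIDES, 2)
                   for i in range(1, n)]
    return fixed + polymorphic

states = pomo_states(10)
print(len(states))  # 4 fixed + 6 * 9 polymorphic = 58
print(states[:6])
```

The count grows as 4 + 6(n - 1), so the state space stays small relative to the number of sampled sequences, which is one reason an allele-frequency-based model can remain fast.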


Subject(s)
Genome-Wide Association Study , Models, Genetic , Bayes Theorem , Evolution, Molecular , Phylogeny , Polymorphism, Genetic
7.
Methods Mol Biol ; 1910: 373-397, 2019.
Article in English | MEDLINE | ID: mdl-31278671

ABSTRACT

Populations evolve as mutations arise in individual organisms and, through hereditary transmission, may become "fixed" (shared by all individuals) in the population. Most mutations are lethal or have negative fitness consequences for the organism. Others have essentially no effect on organismal fitness and can become fixed through the neutral stochastic process known as random drift. However, mutations may also produce a selective advantage that boosts their chances of reaching fixation. Regions of genomes where new mutations are beneficial, rather than neutral or deleterious, tend to evolve more rapidly due to positive selection. Genes involved in immunity and defense are a well-known example; rapid evolution in these genes presumably occurs because new mutations help organisms to prevail in evolutionary "arms races" with pathogens. In recent years genome-wide scans for selection have enlarged our understanding of the genome evolution of various species. In this chapter, we will focus on methods to detect selection on the genome. In particular, we will discuss probabilistic models and how they have changed with the advent of new genome-wide data now available.


Subject(s)
Evolution, Molecular , Genome , Selection, Genetic , Animals , Codon , Computational Biology/methods , Genome-Wide Association Study , Genomics/methods , Genotype , Humans , Mammals/genetics , Models, Genetic , Models, Statistical , Mutation , Phylogeny , Polymorphism, Genetic , Software , Web Browser
8.
Genetics ; 212(4): 1321-1336, 2019 08.
Article in English | MEDLINE | ID: mdl-31147380

ABSTRACT

As multi-individual population-scale data become available, more complex modeling strategies are needed to quantify genome-wide patterns of nucleotide usage and associated mechanisms of evolution. Recently, the multivariate neutral Moran model was proposed. However, it was shown to be insufficient to explain the distribution of alleles in great apes. Here, we propose a new model that includes allelic selection. Our theoretical results constitute the basis of a new Bayesian framework to estimate mutation rates and selection coefficients from population data. We apply the new framework to a great ape dataset, where we found patterns of allelic selection that match those of genome-wide GC-biased gene conversion (gBGC). In particular, we show that great apes have patterns of allelic selection that vary in intensity, a feature that we correlated with great apes' distinct demographies. We also demonstrate that the AT/GC toggling effect decreases the probability of a substitution, promoting more polymorphisms in the base composition of great ape genomes. We further assess the impact of GC-bias in molecular analysis, and find that mutation rates and genetic distances are estimated under bias when gBGC is not properly accounted for. Our results contribute to the discussion on the tempo and mode of gBGC evolution, while stressing the need for gBGC-aware models in population genetics and phylogenetics.


Subject(s)
Gene Conversion , Hominidae/genetics , Models, Genetic , Animals , GC Rich Sequence , Genome , Polymorphism, Genetic
9.
Mol Biol Evol ; 36(6): 1294-1301, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30825307

ABSTRACT

Molecular phylogenetics has neglected polymorphisms within present and ancestral populations for a long time. Recently, multispecies coalescent based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents handling of gene trees and directly infers species trees from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and assessment of branch support values. We exemplify an analysis of a hundred species with ten haploid individuals each, showing that PoMo can perform inference on large data sets. While PoMo is more accurate than standard substitution models applied to concatenated alignments, it is almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.


Subject(s)
Models, Genetic , Mutation Rate , Phylogeny , Polymorphism, Genetic , Animals , Humans , Likelihood Functions , Software
10.
Sci Adv ; 5(1): eaau6947, 2019 01.
Article in English | MEDLINE | ID: mdl-30854422

ABSTRACT

Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of complex evolutionary divergence, hybridization, and reticulation. These results help inform our understanding of similar cases, including modern humans, Neanderthals, Denisovans, and other ancient hominins.


Subject(s)
Biological Evolution , Genomics/methods , Papio/genetics , Animals , Base Sequence , Female , Gene Flow , Haplotypes/genetics , Humans , Hybridization, Genetic , Male , Phylogeny , Polymorphism, Genetic , Whole Genome Sequencing
11.
J Theor Biol ; 439: 166-180, 2018 02 14.
Article in English | MEDLINE | ID: mdl-29229523

ABSTRACT

A central aim of population genetics is the inference of the evolutionary history of a population. To this end, the underlying process can be represented by a model of the evolution of allele frequencies parametrized by e.g., the population size, mutation rates and selection coefficients. A large class of models use forward-in-time models, such as the discrete Wright-Fisher and Moran models and the continuous forward diffusion, to obtain distributions of population allele frequencies, conditional on an ancestral initial allele frequency distribution. Backward-in-time diffusion processes have been rarely used in the context of parameter inference. Here, we demonstrate how forward and backward diffusion processes can be combined to efficiently calculate the exact joint probability distribution of sample and population allele frequencies at all times in the past, for both discrete and continuous population genetics models. This procedure is analogous to the forward-backward algorithm of hidden Markov models. While the efficiency of discrete models is limited by the population size, for continuous models it suffices to expand the transition density in orthogonal polynomials of the order of the sample size to infer marginal likelihoods of population genetic parameters. Additionally, conditional allele trajectories and marginal likelihoods of samples from single populations or from multiple populations that split in the past can be obtained. The described approaches allow for efficient maximum likelihood inference of population genetic parameters in a wide variety of demographic scenarios.


Subject(s)
Genetics, Population/methods , Models, Genetic , Algorithms , Biological Evolution , Gene Frequency , Likelihood Functions , Markov Chains , Methods , Population Density , Time
12.
Stat Appl Genet Mol Biol ; 16(5-6): 387-405, 2017 11 27.
Article in English | MEDLINE | ID: mdl-29095700

ABSTRACT

In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods became popular. However, these methods such as Approximate Bayesian Computation (ABC) can be inefficient in high-dimensional problems. This led to the development of more sophisticated iterative estimation methods like particle filters. Here, we propose an alternative approach that is based on stochastic approximation. By moving along a simulated gradient or ascent direction, the algorithm produces a sequence of estimates that eventually converges to the maximum likelihood estimate, given a set of observed summary statistics. This strategy does not sample much from low-likelihood regions of the parameter space, and is fast, even when many summary statistics are involved. We put considerable efforts into providing tuning guidelines that improve the robustness and lead to good performance on problems with high-dimensional summary statistics and a low signal-to-noise ratio. We then investigate the performance of our resulting approach and study its properties in simulations. Finally, we re-estimate parameters describing the demographic history of Bornean and Sumatran orang-utans.
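
The idea of moving along a simulated ascent direction can be sketched with a one-dimensional Kiefer-Wolfowitz scheme in Python. This is a generic stochastic approximation example, not the authors' algorithm: the toy objective, the gain sequences `a / (k + 1)` and `c / (k + 1) ** 0.25`, and all names are assumptions, and the real method works with simulated summary statistics rather than a closed-form function.

```python
import random

def kiefer_wolfowitz(noisy_objective, theta0, n_iter, rng, a=0.5, c=0.5):
    """Climb a noisy objective using finite-difference gradient estimates
    with decreasing step sizes (a_k) and perturbation widths (c_k)."""
    theta = theta0
    for k in range(n_iter):
        a_k = a / (k + 1)
        c_k = c / (k + 1) ** 0.25
        grad = (noisy_objective(theta + c_k, rng)
                - noisy_objective(theta - c_k, rng)) / (2.0 * c_k)
        theta += a_k * grad
    return theta

def toy_log_likelihood(theta, rng):
    """Stand-in for a simulated likelihood: a concave function with its
    maximum at theta = 3, observed with Gaussian simulation noise."""
    return -(theta - 3.0) ** 2 + rng.gauss(0.0, 0.1)

rng = random.Random(1)
estimate = kiefer_wolfowitz(toy_log_likelihood, theta0=0.0, n_iter=2000, rng=rng)
print(estimate)
```

Each iteration needs only two simulations regardless of how unlikely the current parameter value is, which illustrates why such schemes spend little effort in low-likelihood regions of the parameter space.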


Subject(s)
Genetics, Population/methods , Likelihood Functions , Models, Genetic , Algorithms , Bayes Theorem , Computer Simulation , Evolution, Molecular
13.
Mol Ecol ; 26(14): 3649-3662, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28370647

ABSTRACT

The orchid family is the largest in the angiosperms, but little is known about the molecular basis of the significant variation they exhibit. We investigate here the transcriptomic divergence between two European terrestrial orchids, Dactylorhiza incarnata and Dactylorhiza fuchsii, and integrate these results in the context of their distinct ecologies that we also document. Clear signals of lineage-specific adaptive evolution of protein-coding sequences are identified, notably targeting elements of biotic defence, including both physical and chemical adaptations in the context of divergent pools of pathogens and herbivores. In turn, a substantial regulatory divergence between the two species appears linked to adaptation/acclimation to abiotic conditions. Several of the pathways affected by differential expression are also targeted by deviating post-transcriptional regulation via sRNAs. Finally, D. incarnata appears to suffer from insufficient sRNA control over the activity of RNA-dependent DNA polymerase, resulting in increased activity of class I transposable elements and, over time, in larger genome size than that of D. fuchsii. The extensive molecular divergence between the two species suggests significant genomic and transcriptomic shock in their hybrids and offers insights into the difficulty of coexistence at the homoploid level. Altogether, biological response to selection, accumulated during the history of these orchids, appears governed by their microenvironmental context, in which biotic and abiotic pressures act synergistically to shape transcriptome structure, expression and regulation.


Subject(s)
Adaptation, Biological/genetics , Biological Evolution , Orchidaceae/classification , Transcriptome , DNA Transposable Elements , Ecology , Environment , Genome, Plant , Genomics
14.
Genetics ; 204(2): 723-735, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27542959

ABSTRACT

The effective population size (Ne) is a major factor determining allele frequency changes in natural and experimental populations. Temporal methods provide a powerful and simple approach to estimate short-term Ne. They use allele frequency shifts between temporal samples to calculate the standardized variance, which is directly related to Ne. Here we focus on experimental evolution studies that often rely on repeated sequencing of samples in pools (Pool-seq). Pool-seq is cost-effective and often outperforms individual-based sequencing in estimating allele frequencies, but it is associated with atypical sampling properties: in addition to sampling individuals, sequencing DNA in pools leads to a second round of sampling, which increases the variance of allele frequency estimates. We propose a new estimator of Ne, which relies on allele frequency changes in temporal data and corrects for the variance in both sampling steps. In simulations, we obtain accurate Ne estimates, as long as the drift variance is not too small compared to the sampling and sequencing variance. In addition to genome-wide Ne estimates, we extend our method using a recursive partitioning approach to estimate Ne locally along the chromosome. Since the type I error is controlled, our method permits the identification of genomic regions that differ significantly in their Ne estimates. We present an application to Pool-seq data from experimental evolution with Drosophila and provide recommendations for whole-genome data. The estimator is computationally efficient and available as an R package at https://github.com/ThomasTaus/Nest.
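
For orientation, the classical temporal approach that this work builds on can be sketched in Python. The code below is not the estimator proposed in the abstract and does not include the second-stage Pool-seq correction; it assumes the standardized variance Fc of Nei and Tajima and the familiar single-stage sampling correction for diploid samples of sizes `s0` and `st`. Function names and the example frequencies are made up for illustration.

```python
def standardized_variance(x, y):
    """Fc averaged over loci; x and y hold allele frequencies at the
    first and second time point (one value per biallelic locus)."""
    terms = [(xi - yi) ** 2 / ((xi + yi) / 2.0 - xi * yi)
             for xi, yi in zip(x, y)]
    return sum(terms) / len(terms)

def temporal_ne(x, y, generations, s0, st):
    """Classical temporal Ne estimate: t / (2 * (Fc - 1/(2*s0) - 1/(2*st)))."""
    fc = standardized_variance(x, y)
    corrected = fc - 1.0 / (2.0 * s0) - 1.0 / (2.0 * st)
    if corrected <= 0.0:
        # Sampling noise alone explains the observed frequency change.
        return float("inf")
    return generations / (2.0 * corrected)

# One locus moving from 0.5 to 0.6 in 10 generations, 100 individuals sampled
# at each time point: Fc = 0.04, corrected = 0.03, Ne = 10 / 0.06.
print(temporal_ne([0.5], [0.6], generations=10, s0=100, st=100))
```

Subtracting the sampling terms is what prevents sampling noise from being mistaken for drift; with pooled sequencing, an additional variance component from read sampling would need the same treatment.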


Subject(s)
Directed Molecular Evolution , Gene Frequency/genetics , Population Density , Sequence Analysis, DNA , Alleles , Animals , Drosophila/genetics , Polymorphism, Single Nucleotide/genetics
15.
J Theor Biol ; 407: 362-370, 2016 10 21.
Article in English | MEDLINE | ID: mdl-27480613

ABSTRACT

We present a reversible Polymorphism-Aware Phylogenetic Model (revPoMo) for species tree estimation from genome-wide data. revPoMo enables the reconstruction of large-scale species trees for many within-species samples. It expands the alphabet of DNA substitution models to include polymorphic states, thereby naturally accounting for incomplete lineage sorting. We implemented revPoMo in the maximum likelihood software IQ-TREE. A simulation study and an application to great apes data show that the runtimes of our approach and standard substitution models are comparable but that revPoMo has much better accuracy in estimating trees, divergence times and mutation rates. The advantage of revPoMo is that an increase of sample size per species improves estimations but does not increase runtime. Therefore, revPoMo is a valuable tool with several applications, from speciation dating to species tree reconstruction.


Subject(s)
Models, Genetic , Phylogeny , Polymorphism, Genetic , Animals , Computer Simulation , Diffusion , Hominidae/genetics , Species Specificity
16.
J Gen Virol ; 97(9): 2323-2332, 2016 09.
Article in English | MEDLINE | ID: mdl-27267884

ABSTRACT

Complete genomes of eight reference strains representing different serotypes within the species Fowl aviadenovirus D (FAdV-D) and Fowl aviadenovirus E (FAdV-E) were sequenced. The sequenced genomes of FAdV-D and FAdV-E members comprise 43 287 to 44 336 bp, and have a gene organization identical to that of an earlier sequenced FAdV-D member (strain A-2A). Highest diversity was noticed in the hexon and fiber genes and ORF19. All genomes sequenced in this study contain one fiber gene. Phylogenetic analyses and G+C content support the division of the genus Aviadenovirus into the currently recognized species. Our data also suggest that strain SR48 should be considered as FAdV-11 instead of FAdV-2 and similarly strain HG as FAdV-8b. The present results complete the list of genome sequences of reference strains representing all serotypes in species FAdV-D and FAdV-E.


Subject(s)
Aviadenovirus/classification , Aviadenovirus/genetics , Genetic Variation , Base Composition , Capsid Proteins/genetics , Cluster Analysis , DNA, Viral/chemistry , DNA, Viral/genetics , Gene Order , Genome, Viral , Phylogeny , Sequence Analysis, DNA , Sequence Homology
17.
Syst Biol ; 64(6): 1018-31, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26209413

ABSTRACT

Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches.


Subject(s)
Classification/methods , Computer Simulation , Gene Frequency , Phylogeny , Animals , Hominidae/classification , Hominidae/genetics , Mutation , Polymorphism, Genetic
18.
Bioinformatics ; 31(11): 1762-70, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-25614471

ABSTRACT

MOTIVATION: Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments not only use HTS to measure genomic features at one time point but also monitor them changing over time with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analyzing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth. RESULTS: We present the beta-binomial Gaussian process model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as the proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine it with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present simulations exploring different experimental design choices and results on real data from a Drosophila experimental evolution study of temperature adaptation. AVAILABILITY AND IMPLEMENTATION: R software implementing the test is available at https://github.com/handetopa/BBGP.
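
The role of the beta-binomial component can be illustrated with a short Python sketch of its probability mass function. This is only the standard distribution, not the BBGP software or its Gaussian process part; reading `k` as the alternative-allele read count and `n` as the sequencing depth is an interpretation made here for illustration, and the parameter values are made up.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """P(k successes in n trials) when the success probability itself
    follows a Beta(alpha, beta) distribution."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose
                    + log_beta(k + alpha, n - k + beta)
                    - log_beta(alpha, beta))

# Same expected allele frequency (0.5), different sequencing depths.
shallow = [beta_binomial_pmf(k, 10, 5.0, 5.0) for k in range(11)]
deep = [beta_binomial_pmf(k, 100, 5.0, 5.0) for k in range(101)]
print(sum(shallow), sum(deep))
```

Compared with a plain binomial, the extra Beta layer widens the distribution of read counts, which is how uncertainty about the underlying allele proportion enters alongside the depth-dependent sampling noise.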


Subject(s)
Evolution, Molecular , Gene Frequency , High-Throughput Nucleotide Sequencing/methods , Alleles , Animals , Drosophila/genetics , Genomics/methods , Models, Statistical , Normal Distribution , Polymorphism, Single Nucleotide , Software
19.
Virology ; 462-463: 107-14, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24971703

ABSTRACT

Complete genomes of the first isolates of pigeon adenovirus 1 (PiAdV-1) and Muscovy duck adenovirus (duck adenovirus 2, DAdV-2) were sequenced. The PiAdV-1 genome is 45,480 bp long, and has a gene organization most similar to turkey adenovirus 1. Near the left end of the genome, it lacks ORF0, ORF1A, ORF1B and ORF1C, and possesses ORF52, whereas six novel genes were found near the right end. The DAdV-2 genome is 43,734 bp long, and has a gene organization similar to that of goose adenovirus 4 (GoAdV-4). It lacks ORF51, ORF1C and ORF54, and possesses ORF55A and five other novel genes. PiAdV-1 and DAdV-2 genomes contain two and one fiber genes, respectively. Genome organization, G+C content, molecular phylogeny and host type confirm the need to establish two novel species (Pigeon aviadenovirus A and Duck aviadenovirus B) within the genus Aviadenovirus. Phylogenetic data show that DAdV-2 is most closely related to GoAdV-4.


Subject(s)
Aviadenovirus/genetics , DNA, Viral/chemistry , DNA, Viral/genetics , Genome, Viral , Animals , Aviadenovirus/isolation & purification , Base Composition , Cluster Analysis , Columbidae , Ducks , Gene Order , Molecular Sequence Data , Open Reading Frames , Phylogeny , Sequence Analysis, DNA , Synteny
20.
Elife ; 3: e01311, 2014 Feb 19.
Article in English | MEDLINE | ID: mdl-24554240

ABSTRACT

Orphans are genes restricted to a single phylogenetic lineage and emerge at high rates. While this predicts an accumulation of genes, the gene number has remained remarkably constant through evolution. This paradox has not yet been resolved. Because orphan genes have been mainly analyzed over long evolutionary time scales, orphan loss has remained unexplored. Here we study the patterns of orphan turnover among close relatives in the Drosophila obscura group. We show that orphans are not only emerging at a high rate, but that they are also rapidly lost. Interestingly, recently emerged orphans are more likely to be lost than older ones. Furthermore, highly expressed orphans with a strong male-bias are more likely to be retained. Since both lost and retained orphans show similar evolutionary signatures of functional conservation, we propose that orphan loss is not driven by high rates of sequence evolution, but reflects lineage-specific functional requirements. DOI: http://dx.doi.org/10.7554/eLife.01311.001.


Subject(s)
Drosophila/genetics , Gene Deletion , Genes, Insect , Animals , Evolution, Molecular , Female , Male