1.
BMC Bioinformatics ; 25(1): 151, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627634

ABSTRACT

BACKGROUND: Genomes are inherently inhomogeneous, with features such as base composition, recombination, gene density, and gene expression varying along chromosomes. Evolutionary, biological, and biomedical analyses aim to quantify this variation, account for it during inference procedures, and ultimately determine the causal processes behind it. Since sequential observations along chromosomes are not independent, it is unsurprising that autocorrelation patterns have been observed, e.g., in human base composition. In this article, we develop a class of Hidden Markov Models (HMMs) called oHMMed (ordered HMM with emission densities; the corresponding R package of the same name is available on CRAN). They identify the number of comparably homogeneous regions within autocorrelated observed sequences. These are modelled as discrete hidden states; the observed data points are realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting only neighbouring states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are inferred. RESULTS: We apply our oHMMed algorithms to the proportion of G and C bases (modelled as a mixture of normal distributions) and the number of genes (modelled as a mixture of Poisson-gamma distributions) in windows along the human, mouse, and fruit fly genomes. This results in a partitioning of the genomes into regions by statistically distinguishable averages of these features, and in a characterisation of their continuous patterns of variation. In regard to the genomic G and C proportion, this latter result distinguishes oHMMed from segmentation algorithms based on isochore or compositional domain theory.
We further use oHMMed to conduct a detailed analysis of variation of chromatin accessibility (ATAC-seq) and epigenetic markers H3K27ac and H3K27me3 (modelled as a mixture of Poisson-gamma distributions) along the human chromosome 1 and their correlations. CONCLUSIONS: Our algorithms provide a biologically assumption-free approach to characterising genomic landscapes shaped by continuous, autocorrelated patterns of variation. Despite this, the resulting genome segmentation enables extraction of compositionally distinct regions for further downstream analyses.
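
As a rough illustration of the ordering constraint described in this abstract (hidden states may only move to states that are adjacent in the ordering of their emission means), the Python sketch below builds a tridiagonal transition matrix and labels a toy sequence of window-wise GC proportions. It is not the oHMMed implementation: the function names, the shared standard deviation, the fixed parameter values, and the use of Viterbi decoding with known parameters are all assumptions made for illustration, whereas the abstract states that the parameters are inferred.

```python
import math


def tridiagonal_transitions(n_states, p_move):
    """Transition matrix in which a state can only stay put or step to an adjacent state."""
    T = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_states]
        for j in neighbours:
            T[i][j] = p_move
        T[i][i] = 1.0 - p_move * len(neighbours)
    return T


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def viterbi(obs, means, sd, T):
    """Most probable hidden-state path, computed in log space.

    Assumes a uniform initial distribution and one standard deviation
    shared by all states (both are simplifications for this sketch).
    """
    n = len(means)

    def log(p):
        return math.log(p) if p > 0 else -math.inf

    score = [log(1.0 / n) + log(normal_pdf(obs[0], m, sd)) for m in means]
    back = []
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            # Best predecessor of state j; forbidden jumps have log(0) = -inf.
            best = max(range(n), key=lambda i: prev[i] + log(T[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(T[best][j]) + log(normal_pdf(x, means[j], sd)))
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Toy data: GC proportions in consecutive windows (hypothetical values).
gc_windows = [0.30, 0.31, 0.29, 0.40, 0.41, 0.50, 0.52, 0.51]
T = tridiagonal_transitions(3, p_move=0.1)
print(viterbi(gc_windows, means=[0.3, 0.4, 0.5], sd=0.02, T=T))
```

Because the entries linking non-adjacent states are zero, any decoded path changes by at most one state between consecutive windows, which is the property the abstract describes.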


Subject(s)
Genome , Genomics , Animals , Humans , Mice , Markov Chains , Base Composition , Probability , Algorithms
2.
Science ; 383(6682): 531-537, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38301018

ABSTRACT

Large mammalian herbivores (megafauna) have experienced extinctions and declines since prehistory. Introduced megafauna have partly counteracted these losses yet are thought to have unusually negative effects on plants compared with native megafauna. Using a meta-analysis of 3995 plot-scale plant abundance and diversity responses from 221 studies, we found no evidence that megafauna impacts were shaped by nativeness, "invasiveness," "feralness," coevolutionary history, or functional and phylogenetic novelty. Nor was there evidence that introduced megafauna facilitate introduced plants more than native megafauna. Instead, we found strong evidence that functional traits shaped megafauna impacts, with larger-bodied and bulk-feeding megafauna promoting plant diversity. Our work suggests that trait-based ecology provides better insight into interactions between megafauna and plants than do concepts of nativeness.


Subject(s)
Ecosystem , Extinction, Biological , Herbivory , Introduced Species , Mammals , Plants , Animals , Ecology , Herbivory/physiology , Phylogeny , Conservation of Natural Resources
3.
Nature ; 625(7996): 735-742, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38030727

ABSTRACT

Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3-9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validated their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.


Subject(s)
Conserved Sequence , Evolution, Molecular , Genome , Primates , Animals , Female , Humans , Pregnancy , Conserved Sequence/genetics , Deoxyribonuclease I/metabolism , DNA/genetics , DNA/metabolism , Genome/genetics , Mammals/classification , Mammals/genetics , Placenta , Primates/classification , Primates/genetics , Regulatory Sequences, Nucleic Acid/genetics , Reproducibility of Results , Transcription Factors/metabolism , Proteins/genetics , Gene Expression Regulation/genetics
4.
Nat Commun ; 14(1): 7679, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-37996436

ABSTRACT

The worldwide extinction of megafauna during the Late Pleistocene and Early Holocene is evident from the fossil record, with dominant theories suggesting a climate, human or combined impact cause. Consequently, two disparate scenarios are possible for the surviving megafauna during this time period - they could have declined due to similar pressures, or increased in population size due to reductions in competition or other biotic pressures. We therefore infer population histories of 139 extant megafauna species using genomic data which reveal population declines in 91% of species throughout the Quaternary period, with larger species experiencing the strongest decreases. Declines become ubiquitous 32-76 kya across all landmasses, a pattern better explained by worldwide Homo sapiens expansion than by changes in climate. We estimate that, in consequence, total megafauna abundance, biomass, and energy turnover decreased by 92-95% over the past 50,000 years, implying major human-driven ecosystem restructuring at a global scale.


Subject(s)
Climate Change , Ecosystem , Humans , Animals , Extinction, Biological , Fossils , Biomass
5.
Nat Ecol Evol ; 7(7): 1114-1130, 2023 07.
Article in English | MEDLINE | ID: mdl-37268856

ABSTRACT

The Y chromosome usually plays a critical role in determining male sex and comprises sequence classes that have experienced unique evolutionary trajectories. Here we generated 19 new primate sex chromosome assemblies, analysed them with 10 existing assemblies and report rapid evolution of the Y chromosome across primates. The pseudoautosomal boundary has shifted at least six times during primate evolution, leading to the formation of a Simiiformes-specific evolutionary stratum and to the independent start of young strata in Catarrhini and Platyrrhini. Different primate lineages experienced different rates of gene loss and structural and chromatin change on their Y chromosomes. Selection on several Y-linked genes has contributed to the evolution of male developmental traits across the primates. Additionally, lineage-specific expansions of ampliconic regions have further increased the diversification of the structure and gene composition of the Y chromosome. Overall, our comprehensive analysis has broadened our knowledge of the evolution of the primate Y chromosome.


Subject(s)
Evolution, Molecular , Y Chromosome , Animals , Male , Y Chromosome/genetics , Primates/genetics
6.
Science ; 380(6648): 906-913, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37262161

ABSTRACT

The rich diversity of morphology and behavior displayed across primate species provides an informative context in which to study the impact of genomic diversity on fundamental biological processes. Analysis of that diversity provides insight into long-standing questions in evolutionary and conservation biology and is urgent given severe threats these species are facing. Here, we present high-coverage whole-genome data from 233 primate species representing 86% of genera and all 16 families. This dataset was used, together with fossil calibration, to create a nuclear DNA phylogeny and to reassess evolutionary divergence times among primate clades. We found within-species genetic diversity across families and geographic regions to be associated with climate and sociality, but not with extinction risk. Furthermore, mutation rates differ across species, potentially influenced by effective population sizes. Lastly, we identified extensive recurrence of missense mutations previously thought to be human specific. This study will open a wide range of research avenues for future primate genomic research.


Subject(s)
Biological Evolution , Genetic Variation , Primates , Animals , Humans , Genome , Mutation Rate , Phylogeny , Primates/genetics , Population Density
7.
Science ; 380(6648): eabn8197, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37262156

ABSTRACT

Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.


Subject(s)
Genetic Variation , Primates , Animals , Humans , Base Sequence , Gene Frequency , Primates/genetics , Whole Genome Sequencing
8.
Science ; 380(6648): eabn8153, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37262153

ABSTRACT

Baboons (genus Papio) are a morphologically and behaviorally diverse clade of catarrhine monkeys that have experienced hybridization between phenotypically and genetically distinct phylogenetic species. We used high-coverage whole-genome sequences from 225 wild baboons representing 19 geographic localities to investigate population genomics and interspecies gene flow. Our analyses provide an expanded picture of evolutionary reticulation among species and reveal patterns of population structure within and among species, including differential admixture among conspecific populations. We describe the first example of a baboon population with a genetic composition that is derived from three distinct lineages. The results reveal processes, both ancient and recent, that produced the observed mismatch between phylogenetic relationships based on matrilineal, patrilineal, and biparental inheritance. We also identified several candidate genes that may contribute to species-specific phenotypes.


Subject(s)
Biological Evolution , Gene Flow , Papio , Animals , Male , Papio/anatomy & histology , Papio/genetics , Phenotype , Phylogeny , Species Specificity , Sex Characteristics
9.
bioRxiv ; 2023 May 02.
Article in English | MEDLINE | ID: mdl-37205491

ABSTRACT

Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species, and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have non-deleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases. One Sentence Summary: Deep learning classifier trained on 4.3 million common primate missense variants predicts variant pathogenicity in humans.

10.
Genome Biol ; 23(1): 215, 2022 10 17.
Article in English | MEDLINE | ID: mdl-36253794

ABSTRACT

BACKGROUND: The pseudoautosomal region 1 (PAR1) is a 2.7 Mb telomeric region of human sex chromosomes. PAR1 has a crucial role in ensuring proper segregation of sex chromosomes during male meiosis, exposing it to extreme recombination and mutation processes. We investigate PAR1 evolution using population genomic datasets of extant humans, eight populations of great apes, and two archaic human genome sequences. RESULTS: We find that PAR1 is fast evolving and closer to evolutionary nucleotide equilibrium than autosomal telomeres. We detect a difference between substitution patterns and extant diversity in PAR1, mainly driven by the conflict between strong mutation and recombination-associated fixation bias at CpG sites. We detect excess C-to-G mutations in PAR1 of all great apes, specific to the mutagenic effect of male recombination. Despite recent evidence for Y chromosome introgression from humans into Neanderthals, we find that the Neanderthal PAR1 retained similarity to the Denisovan sequence. We find differences between substitution spectra of these archaics suggesting rapid evolution of PAR1 in recent hominin history. Frequency analysis of alleles segregating in females and males provided no evidence for recent sexual antagonism in this region. We study repeat content and double-strand break hotspot regions in PAR1 and find that they may play roles in ensuring the obligate X-Y recombination event during male meiosis. CONCLUSIONS: Our study provides an unprecedented quantification of population genetic forces governing PAR1 biology across extant and extinct hominids. PAR1 evolutionary dynamics are predominantly governed by recombination processes with a strong impact on mutation patterns across all species.


Subject(s)
Hominidae , Pseudoautosomal Regions , Animals , Female , Hominidae/genetics , Humans , Male , Nucleotides , Receptor, PAR-1/genetics , Y Chromosome/genetics
11.
Genetics ; 218(3)2021 07 14.
Article in English | MEDLINE | ID: mdl-34081117

ABSTRACT

The nucleotide composition of the genome is a balance between the origin and fixation rates of different mutations. For example, it is well-known that transitions occur more frequently than transversions, particularly at CpG sites. Differences in fixation rates of mutation types are less explored. Specifically, recombination-associated GC-biased gene conversion (gBGC) may differentially impact GC-changing mutations, due to differences in their genomic distributions and efficiency of mismatch repair mechanisms. Given that recombination evolves rapidly across species, we explore gBGC of different mutation types across human populations and great ape species. We report a stronger correlation between segregating GC frequency and recombination for transitions than for transversions. Notably, CpG transitions are most strongly affected by gBGC in humans and chimpanzees. We show that the overall strength of gBGC is generally correlated with effective population sizes in humans, with some notable exceptions, such as a stronger effect of gBGC on non-CpG transitions in populations of European descent. Furthermore, species of the genera Gorilla and Pongo have a greatly reduced gBGC effect on CpG sites. We also study the dependence of gBGC dynamics on flanking nucleotides and show that some mutation types evolve in opposition to the gBGC expectation, likely due to the hypermutability of specific nucleotide contexts. Our results highlight the importance of different gBGC dynamics experienced by GC-changing mutations and their impact on nucleotide composition evolution.


Subject(s)
CpG Islands/genetics , Mutation Rate , Pan troglodytes/genetics , Population/genetics , Animals , Gene Conversion , Genome, Human , Humans , Recombination, Genetic
12.
Nature ; 594(7862): 227-233, 2021 06.
Article in English | MEDLINE | ID: mdl-33910227

ABSTRACT

The accurate and complete assembly of both haplotype sequences of a diploid organism is essential to understanding the role of variation in genome functions, phenotypes and diseases [1]. Here, using a trio-binning approach, we present a high-quality, diploid reference genome, with both haplotypes assembled independently at the chromosome level, for the common marmoset (Callithrix jacchus), a primate model system that is widely used in biomedical research [2,3]. The full spectrum of heterozygosity between the two haplotypes involves 1.36% of the genome, much higher than the 0.13% indicated by the standard estimation based on single-nucleotide heterozygosity alone. The de novo mutation rate is 0.43 × 10^-8 per site per generation, and the paternally inherited genome acquired twice as many mutations as the maternal. Our diploid assembly enabled us to discover a recent expansion of the sex-differentiation region and unique evolutionary changes in the marmoset Y chromosome. In addition, we identified many genes with signatures of positive selection that might have contributed to the evolution of Callithrix biological features. Brain-related genes were highly conserved between marmosets and humans, although several genes experienced lineage-specific copy number variations or diversifying selection, with implications for the use of marmosets as a model system.


Subject(s)
Callithrix/genetics , Diploidy , Evolution, Molecular , Genome/genetics , Genomics/standards , Animals , Biomedical Research , DNA Copy Number Variations , Female , Germ-Line Mutation/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation/genetics , Male , Reference Standards , Selection, Genetic , Sex Differentiation/genetics , Y Chromosome/genetics
13.
Mol Biol Evol ; 36(5): 990-998, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30903659

ABSTRACT

A long-standing question in evolutionary biology is the relative contribution of large and small effect mutations to the adaptive process. We have investigated this question in proteins by estimating the rate of adaptive evolution between all pairs of amino acids separated by one mutational step using a McDonald-Kreitman type approach and genome-wide data from several Drosophila species. We find that the rate of adaptive evolution is highest among amino acids that are more similar. This is partly due to the fact that the proportion of mutations that are adaptive is higher among more similar amino acids. We also find that the rate of neutral evolution between amino acids is higher among more similar amino acids. Overall our results suggest that both the adaptive and nonadaptive evolution of proteins are dominated by substitutions between similar amino acids.
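
For readers unfamiliar with McDonald-Kreitman type approaches, the sketch below shows the simplest textbook estimator of the proportion of nonsynonymous substitutions attributable to adaptation, alpha = 1 - (Ds * Pn) / (Dn * Ps), computed from counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps). This is an assumption-laden illustration rather than the estimator used in the article, which the abstract does not specify; the function name and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the adaptive fraction of substitutions.

    dn, ds: nonsynonymous and synonymous fixed differences between species
    pn, ps: nonsynonymous and synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts for one pair of amino acids one mutational step apart.
print(mk_alpha(dn=40, ds=100, pn=20, ps=100))
```

An excess of nonsynonymous divergence relative to nonsynonymous polymorphism gives a positive alpha; in practice, slightly deleterious polymorphisms bias this simple estimator, which is one reason more elaborate methods exist.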


Subject(s)
Adaptation, Biological/genetics , Amino Acid Substitution , Evolution, Molecular , Insect Proteins/genetics , Mutation , Amino Acids/chemistry , Amino Acids/genetics , Animals , Drosophila melanogaster
14.
J Theor Biol ; 439: 166-180, 2018 02 14.
Article in English | MEDLINE | ID: mdl-29229523

ABSTRACT

A central aim of population genetics is the inference of the evolutionary history of a population. To this end, the underlying process can be represented by a model of the evolution of allele frequencies parametrized by, e.g., the population size, mutation rates and selection coefficients. Many approaches use forward-in-time models, such as the discrete Wright-Fisher and Moran models and the continuous forward diffusion, to obtain distributions of population allele frequencies, conditional on an ancestral initial allele frequency distribution. Backward-in-time diffusion processes have been rarely used in the context of parameter inference. Here, we demonstrate how forward and backward diffusion processes can be combined to efficiently calculate the exact joint probability distribution of sample and population allele frequencies at all times in the past, for both discrete and continuous population genetics models. This procedure is analogous to the forward-backward algorithm of hidden Markov models. While the efficiency of discrete models is limited by the population size, for continuous models it suffices to expand the transition density in orthogonal polynomials of the order of the sample size to infer marginal likelihoods of population genetic parameters. Additionally, conditional allele trajectories and marginal likelihoods of samples from single populations or from multiple populations that split in the past can be obtained. The described approaches allow for efficient maximum likelihood inference of population genetic parameters in a wide variety of demographic scenarios.
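
The forward-in-time half of this idea can be sketched with the discrete Wright-Fisher model: a transition matrix carries a distribution over allele counts from one generation to the next. The code below is a minimal sketch under stated assumptions (a haploid population of size n, binomial resampling, optional mutation probabilities `mu` and `nu`, no selection); the function names are hypothetical, and neither the backward pass nor the orthogonal polynomial expansion mentioned in the abstract is shown.

```python
import math


def wright_fisher_matrix(n, mu=0.0, nu=0.0):
    """P[i][j]: probability of j allele copies next generation given i copies now.

    mu: per-generation mutation probability towards the focal allele
    nu: per-generation mutation probability away from the focal allele
    """
    P = []
    for i in range(n + 1):
        x = i / n
        p = x * (1.0 - nu) + (1.0 - x) * mu  # expected frequency after mutation
        P.append([math.comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(n + 1)])
    return P


def step_forward(dist, P):
    """Propagate a distribution over allele counts by one generation."""
    m = len(P)
    return [sum(dist[i] * P[i][j] for i in range(m)) for j in range(m)]


# Start with exactly 5 copies in a population of 10 and run 20 generations.
n = 10
P = wright_fisher_matrix(n)
dist = [0.0] * (n + 1)
dist[5] = 1.0
for _ in range(20):
    dist = step_forward(dist, P)
print([round(d, 3) for d in dist])
```

The matrix has (n + 1) x (n + 1) entries, which illustrates the abstract's remark that the efficiency of discrete models is limited by the population size.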


Subject(s)
Genetics, Population/methods , Models, Genetic , Algorithms , Biological Evolution , Gene Frequency , Likelihood Functions , Markov Chains , Methods , Population Density , Time
15.
Genome Biol Evol ; 10(1): 269-275, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29036491

ABSTRACT

In many organisms, local deviations from Chargaff's second parity rule are observed around replication and transcription start sites and within intron sequences. Here, we use expression data as well as a whole-genome data set of nearly 200 haplotypes to investigate such compositional skews in Drosophila melanogaster genes. We find a positive correlation between compositional skew and gene expression, comparable in strength to similar correlations between expression levels and genome-wide sequence features. This correlation is relatively stronger for germline, compared with somatic expression, consistent with the process of transcription-associated mutation bias. We also inferred mutation rates from alleles segregating at low frequencies in short introns, and show that, whereas the overall GC content of short introns does not conform to the equilibrium expectation, the level of the observed deviation from the second parity rule is generally consistent with the inferred rates.
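
Chargaff's second parity rule states that, within a single DNA strand, A occurs about as often as T and G about as often as C. A common way to quantify deviations is a normalised skew such as (G - C) / (G + C). The sketch below computes this for a sequence read along one strand; it assumes this conventional definition, which may differ from the exact statistic used in the article, and the function names and sequences are hypothetical.

```python
def skew(seq, first, second):
    """Normalised skew (first - second) / (first + second) along a single strand."""
    seq = seq.upper()
    a, b = seq.count(first), seq.count(second)
    if a + b == 0:
        raise ValueError(f"sequence contains neither {first} nor {second}")
    return (a - b) / (a + b)


def gc_skew(seq):
    """Excess of G over C; zero when the second parity rule holds exactly."""
    return skew(seq, "G", "C")


def at_skew(seq):
    """Excess of A over T; zero when the second parity rule holds exactly."""
    return skew(seq, "A", "T")


# Hypothetical short intron sequence, read along the transcribed direction.
intron = "GTAAGTTGGCATTGGAATTTCAG"
print(gc_skew(intron), at_skew(intron))
```

Values near zero are what the parity rule predicts; consistently non-zero values in transcribed regions are the kind of compositional skew the abstract relates to expression.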


Subject(s)
Drosophila melanogaster/genetics , Evolution, Molecular , Genes, Insect , Transcription, Genetic , Animals , Base Composition , Mutation Accumulation
16.
Theor Popul Biol ; 106: 71-82, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26597774

ABSTRACT

In a classical study, Wright (1931) proposed a model for the evolution of a biallelic locus under the influence of mutation, directional selection and drift. He derived the equilibrium distribution of the allelic proportion conditional on the scaled mutation rate, the mutation bias and the scaled strength of directional selection. The equilibrium distribution can be used for inference of these parameters with genome-wide datasets of "site frequency spectra" (SFS). Assuming that the scaled mutation rate is low, Wright's model can be approximated by a boundary-mutation model, where mutations are introduced into the population exclusively from sites fixed for the preferred or unpreferred allelic states. With the boundary-mutation model, inference can be partitioned: (i) the shape of the SFS distribution within the polymorphic region is determined by random drift and directional selection, but not by the mutation parameters, such that inference of the selection parameter relies exclusively on the polymorphic sites in the SFS; (ii) the mutation parameters can be inferred from the amount of polymorphic and monomorphic preferred and unpreferred alleles, conditional on the selection parameter. Herein, we derive maximum likelihood estimators for the mutation and selection parameters in equilibrium and apply the method to simulated SFS data as well as empirical data from a Madagascar population of Drosophila simulans.


Subject(s)
Drosophila simulans/genetics , Genetics, Population/methods , Selection, Genetic , Alleles , Animals , Evolution, Molecular , Gene Frequency , Likelihood Functions , Models, Genetic , Mutation
17.
Food Technol Biotechnol ; 53(4): 367-378, 2015 Dec.
Article in English | MEDLINE | ID: mdl-27904371

ABSTRACT

Sporulation efficiency in the yeast Saccharomyces cerevisiae is a well-established model for studying quantitative traits. A variety of genes and nucleotides causing different sporulation efficiencies in laboratory, as well as in wild strains, has already been extensively characterised (mainly by reciprocal hemizygosity analysis and nucleotide exchange methods). We applied a different strategy in order to analyze the variation in sporulation efficiency of laboratory yeast strains. Coupling classical quantitative genetic analysis with simulations of phenotypic distributions (a method we call phenotype modelling) enabled us to obtain a detailed picture of the quantitative trait loci (QTLs) relationships underlying the phenotypic variation of this trait. Using this approach, we were able to uncover a dominant epistatic inheritance of loci governing the phenotype. Moreover, a molecular analysis of known causative quantitative trait genes and nucleotides allowed for the detection of novel alleles, potentially responsible for the observed phenotypic variation. Based on the molecular data, we hypothesise that the observed dominant epistatic relationship could be caused by the interaction of multiple quantitative trait nucleotides distributed across a 60-kb QTL region located on chromosome XIV and the RME1 locus on chromosome VII. Furthermore, we propose a model of molecular pathways which possibly underlie the phenotypic variation of this trait.
