1.
Syst Biol ; 71(2): 301-319, 2022 02 10.
Article in English | MEDLINE | ID: mdl-33983440

ABSTRACT

The tree of life is the fundamental biological roadmap for navigating the evolution and properties of life on Earth, and yet remains largely unknown. Even angiosperms (flowering plants) are fraught with data gaps, despite their critical role in sustaining terrestrial life. Today, high-throughput sequencing promises to significantly deepen our understanding of evolutionary relationships. Here, we describe a comprehensive phylogenomic platform for exploring the angiosperm tree of life, comprising a set of open tools and data based on the 353 nuclear genes targeted by the universal Angiosperms353 sequence capture probes. The primary goals of this article are to (i) document our methods, (ii) describe our first data release, and (iii) present a novel open data portal, the Kew Tree of Life Explorer (https://treeoflife.kew.org). We aim to generate novel target sequence capture data for all genera of flowering plants, exploiting natural history collections such as herbarium specimens, and augment it with mined public data. Our first data release, described here, is the most extensive nuclear phylogenomic data set for angiosperms to date, comprising 3099 samples validated by DNA barcode and phylogenetic tests, representing all 64 orders, 404 families (96%) and 2333 genera (17%). A "first pass" angiosperm tree of life was inferred from the data, which totaled 824,878 sequences, 489,086,049 base pairs, and 532,260 alignment columns, for interactive presentation in the Kew Tree of Life Explorer. This species tree was generated using methods that were rigorous, yet tractable at our scale of operation. Despite limitations pertaining to taxon and gene sampling, gene recovery, models of sequence evolution and paralogy, the tree strongly supports existing taxonomy, while challenging numerous hypothesized relationships among orders and placing many genera for the first time.
The validated data set, species tree and all intermediates are openly accessible via the Kew Tree of Life Explorer and will be updated as further data become available. This major milestone toward a complete tree of life for all flowering plant species opens doors to a highly integrated future for angiosperm phylogenomics through the systematic sequencing of standardized nuclear markers. Our approach has the potential to serve as a much-needed bridge between the growing movement to sequence the genomes of all life on Earth and the vast phylogenomic potential of the world's natural history collections. [Angiosperms; Angiosperms353; genomics; herbariomics; museomics; nuclear phylogenomics; open access; target sequence capture; tree of life.].


Subject(s)
Magnoliopsida , Genomics , High-Throughput Nucleotide Sequencing , Humans , Magnoliopsida/genetics , Phylogeny
3.
Nature ; 586(7831): 741-748, 2020 10.
Article in English | MEDLINE | ID: mdl-33116287

ABSTRACT

The African continent is regarded as the cradle of modern humans and African genomes contain more genetic variation than those from any other continent, yet only a fraction of the genetic diversity among African individuals has been surveyed [1]. Here we performed whole-genome sequencing analyses of 426 individuals (comprising 50 ethnolinguistic groups, including previously unsampled populations) to explore the breadth of genomic diversity across Africa. We uncovered more than 3 million previously undescribed variants, most of which were found among individuals from newly sampled ethnolinguistic groups, as well as 62 previously unreported loci that are under strong selection, which were predominantly found in genes that are involved in viral immunity, DNA repair and metabolism. We observed complex patterns of ancestral admixture and putative-damaging and novel variation, both within and between populations, alongside evidence that Zambia was a likely intermediate site along the routes of expansion of Bantu-speaking populations. Pathogenic variants in genes that are currently characterized as medically relevant were uncommon, but in other genes, variants denoted as 'likely pathogenic' in the ClinVar database were commonly observed. Collectively, these findings refine our current understanding of continental migration, identify gene flow and the response to human disease as strong drivers of genome-level population variation, and underscore the scientific imperative for a broader characterization of the genomic diversity of African individuals to understand human ancestry and improve health.


Subject(s)
Genetic Variation , Genome, Human/genetics , Genomics , Health , Human Migration , Africa/ethnology , DNA Repair/genetics , Datasets as Topic , Female , Gene Flow , Genetics, Medical , Genetics, Population , Health/history , History, Ancient , Human Migration/history , Humans , Immunity/genetics , Language , Male , Metabolism/genetics , Selection, Genetic , Whole Genome Sequencing
4.
Nat Plants ; 5(11): 1120-1128, 2019 11.
Article in English | MEDLINE | ID: mdl-31685951

ABSTRACT

Tetraploid emmer wheat (Triticum turgidum ssp. dicoccon) is a progenitor of the world's most widely grown crop, hexaploid bread wheat (Triticum aestivum), as well as the direct ancestor of tetraploid durum wheat (T. turgidum subsp. turgidum). Emmer was one of the first cereals to be domesticated in the old world; it was cultivated from around 9700 BC in the Levant [1,2] and subsequently in south-western Asia, northern Africa and Europe with the spread of Neolithic agriculture [3,4]. Here, we report a whole-genome sequence from a museum specimen of Egyptian emmer wheat chaff, 14C dated to the New Kingdom, 1130-1000 BC. Its genome shares haplotypes with modern domesticated emmer at loci that are associated with shattering, seed size and germination, as well as within other putative domestication loci, suggesting that these traits share a common origin before the introduction of emmer to Egypt. Its genome is otherwise unusual, carrying haplotypes that are absent from modern emmer. Genetic similarity with modern Arabian and Indian emmer landraces connects ancient Egyptian emmer with early south-eastern dispersals, whereas inferred gene flow with wild emmer from the Southern Levant signals a later connection. Our results show the importance of museum collections as sources of genetic data to uncover the history and diversity of ancient cereals.


Subject(s)
Domestication , Genome, Plant , Triticum/genetics , DNA, Plant , Edible Grain/history , Egypt , History, Ancient , Phylogeny , Sequence Analysis, DNA
5.
Syst Biol ; 68(4): 594-606, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30535394

ABSTRACT

Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5-15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. 
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.


Subject(s)
DNA Probes , Magnoliopsida/genetics , Sequence Analysis, DNA/methods , Cluster Analysis
6.
Am J Bot ; 105(3): 614-622, 2018 03.
Article in English | MEDLINE | ID: mdl-29603138

ABSTRACT

Providing science and society with an integrated, up-to-date, high quality, open, reproducible and sustainable plant tree of life would be a huge service that is now coming within reach. However, synthesizing the growing body of DNA sequence data in the public domain and disseminating the trees to a diverse audience are often not straightforward due to numerous informatics barriers. While big synthetic plant phylogenies are being built, they remain static and become quickly outdated as new data are published and tree-building methods improve. Moreover, the body of existing phylogenetic evidence is hard to navigate and access for non-experts. We propose that our community of botanists, tree builders, and informaticians should converge on a modular framework for data integration and phylogenetic analysis, allowing easy collaboration, updating, data sourcing and flexible analyses. With support from major institutions, this pipeline should be re-run at regular intervals, storing trees and their metadata long-term. Providing the trees to a diverse global audience through user-friendly front ends and application development interfaces should also be a priority. Interactive interfaces could be used to solicit user feedback and thus improve data quality and to coordinate the generation of new data. We conclude by outlining a number of steps that we suggest the scientific community should take to achieve global phylogenetic synthesis.


Subject(s)
Information Dissemination , Information Management , Phylogeny , Plants/genetics , DNA, Plant , Humans , Information Technology , Sequence Analysis, DNA
7.
Nat Commun ; 8: 16082, 2017 07 18.
Article in English | MEDLINE | ID: mdl-28719574

ABSTRACT

Europe has played a major role in dog evolution, harbouring the oldest uncontested Palaeolithic remains and having been the centre of modern dog breed creation. Here we sequence the genomes of an Early and End Neolithic dog from Germany, including a sample associated with an early European farming community. Both dogs demonstrate continuity with each other and predominantly share ancestry with modern European dogs, contradicting a previously suggested Late Neolithic population replacement. We find no genetic evidence to support the recent hypothesis proposing dual origins of dog domestication. By calibrating the mutation rate using our oldest dog, we narrow the timing of dog domestication to 20,000-40,000 years ago. Interestingly, we do not observe the extreme copy number expansion of the AMY2B gene characteristic of modern dogs that has previously been proposed as an adaptation to a starch-rich diet driven by the widespread adoption of agriculture in the Neolithic.


Subject(s)
Biological Evolution , DNA, Mitochondrial/genetics , Dogs/genetics , Genome , Animals , Domestication , Genetic Variation , Phylogeography
8.
Proc Natl Acad Sci U S A ; 113(4): E440-9, 2016 Jan 26.
Article in English | MEDLINE | ID: mdl-26712023

ABSTRACT

The Out-of-Africa (OOA) dispersal ∼50,000 y ago is characterized by a series of founder events as modern humans expanded into multiple continents. Population genetics theory predicts an increase of mutational load in populations undergoing serial founder effects during range expansions. To test this hypothesis, we have sequenced full genomes and high-coverage exomes from seven geographically divergent human populations from Namibia, Congo, Algeria, Pakistan, Cambodia, Siberia, and Mexico. We find that individual genomes vary modestly in the overall number of predicted deleterious alleles. We show via spatially explicit simulations that the observed distribution of deleterious allele frequencies is consistent with the OOA dispersal, particularly under a model where deleterious mutations are recessive. We conclude that there is a strong signal of purifying selection at conserved genomic positions within Africa, but that many predicted deleterious mutations have evolved as if they were neutral during the expansion out of Africa. Under a model where selection is inversely related to dominance, we show that OOA populations are likely to have a higher mutation load due to increased allele frequencies of nearly neutral variants that are recessive or partially recessive.


Subject(s)
Ethnicity/genetics , Genome, Human , Human Migration , Mutation , Africa South of the Sahara , Alleles , Animals , Asian People/genetics , Black People/genetics , Computer Simulation , Conserved Sequence , Evolution, Molecular , Founder Effect , Gene Flow , Genetic Diseases, Inborn/genetics , Genetic Drift , Genotype , Homing Behavior , Humans , Indians, Central American/genetics , Models, Genetic , Selection, Genetic
9.
Nat Rev Genet ; 16(6): 333-43, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25963372

ABSTRACT

Next-generation sequencing technology has facilitated the discovery of millions of genetic variants in human genomes. A sizeable fraction of these variants are predicted to be deleterious. Here, we review the pattern of deleterious alleles as ascertained in genome sequencing data sets and ask whether human populations differ in their predicted burden of deleterious alleles - a phenomenon known as mutation load. We discuss three demographic models that are predicted to affect mutation load and relate these models to the evidence (or the lack thereof) for variation in the efficacy of purifying selection in diverse human genomes. We also emphasize why accurate estimation of mutation load depends on assumptions regarding the distribution of dominance and selection coefficients - quantities that remain poorly characterized for current genomic data sets.


Subject(s)
Genome, Human , Founder Effect , Gene Frequency , Genes, Dominant , Genetic Drift , Human Migration , Humans , Models, Genetic , Mutation , Selection, Genetic
10.
Proc Natl Acad Sci U S A ; 110(29): 11791-6, 2013 Jul 16.
Article in English | MEDLINE | ID: mdl-23733930

ABSTRACT

Human genetic diversity in southern Europe is higher than in other regions of the continent. This difference has been attributed to postglacial expansions, the demic diffusion of agriculture from the Near East, and gene flow from Africa. Using SNP data from 2,099 individuals in 43 populations, we show that estimates of recent shared ancestry between Europe and Africa are substantially increased when gene flow from North Africans, rather than Sub-Saharan Africans, is considered. The gradient of North African ancestry accounts for previous observations of low levels of sharing with Sub-Saharan Africa and is independent of recent gene flow from the Near East. The source of genetic diversity in southern Europe has important biomedical implications; we find that most disease risk alleles from genome-wide association studies follow expected patterns of divergence between Europe and North Africa, with the principal exception of multiple sclerosis.


Subject(s)
Gene Flow/genetics , Genetic Variation , Genetics, Population , White People/genetics , White People/history , Africa, Northern , Demography , Europe , Haplotypes/genetics , History, Ancient , Humans , Polymorphism, Single Nucleotide/genetics
11.
PLoS Genet ; 9(2): e1003316, 2013.
Article in English | MEDLINE | ID: mdl-23468648

ABSTRACT

The Levant is a region in the Near East with an impressive record of continuous human existence and major cultural developments since the Paleolithic period. Genetic and archeological studies present solid evidence placing the Middle East and the Arabian Peninsula as the first stepping-stone outside Africa. There is, however, little understanding of demographic changes in the Middle East, particularly the Levant, after the first Out-of-Africa expansion and how the Levantine peoples relate genetically to each other and to their neighbors. In this study we analyze more than 500,000 genome-wide SNPs in 1,341 new samples from the Levant and compare them to samples from 48 populations worldwide. Our results show recent genetic stratifications in the Levant are driven by the religious affiliations of the populations within the region. Cultural changes within the last two millennia appear to have facilitated/maintained admixture between culturally similar populations from the Levant, Arabian Peninsula, and Africa. The same cultural changes seem to have resulted in genetic isolation of other groups by limiting admixture with culturally different neighboring populations. Consequently, Levant populations today fall into two main groups: one sharing more genetic characteristics with modern-day Europeans and Central Asians, and the other with closer genetic affinities to other Middle Easterners and Africans. Finally, we identify a putative Levantine ancestral component that diverged from other Middle Easterners ∼23,700-15,500 years ago during the last glacial period, and diverged from Europeans ∼15,900-9,100 years ago between the last glacial warming and the start of the Neolithic.


Subject(s)
Chromosomes, Human, Y/genetics , DNA, Mitochondrial/genetics , Genetic Variation , Genetics, Population , Archaeology , Black People , Cultural Evolution , Ethnicity/genetics , Genome, Human , Haplotypes , Humans , Middle East , Phylogeny , White People
12.
PLoS One ; 7(10): e47765, 2012.
Article in English | MEDLINE | ID: mdl-23082212

ABSTRACT

One of the main findings derived from the analysis of the Neandertal genome was the evidence for admixture between Neandertals and non-African modern humans. An alternative scenario is that the ancestral population of non-Africans was closer to Neandertals than to Africans because of ancient population substructure. Thus, the study of North African populations is crucial for testing both hypotheses. We analyzed a total of 780,000 SNPs in 125 individuals representing seven different North African locations and searched for their ancestral/derived state in comparison to different human populations and Neandertals. We found that North African populations have a significant excess of derived alleles shared with Neandertals, when compared to sub-Saharan Africans. This excess is similar to that found in non-African humans, a fact that can be interpreted as a sign of Neandertal admixture. Furthermore, the Neandertal's genetic signal is higher in populations with a local, pre-Neolithic North African ancestry. Therefore, the detected ancient admixture is not due to recent Near Eastern or European migrations. Sub-Saharan populations are the only ones not affected by the admixture event with Neandertals.


Subject(s)
Gene Pool , Genetics, Population , Neanderthals/genetics , Africa South of the Sahara , Africa, Northern , Animals , Asia , Europe , Genealogy and Heraldry , Humans , Principal Component Analysis
13.
Proc Natl Acad Sci U S A ; 109(34): 13865-70, 2012 Aug 21.
Article in English | MEDLINE | ID: mdl-22869716

ABSTRACT

North African Jews constitute the second largest Jewish Diaspora group. However, their relatedness to each other; to European, Middle Eastern, and other Jewish Diaspora groups; and to their former North African non-Jewish neighbors has not been well defined. Here, genome-wide analysis of five North African Jewish groups (Moroccan, Algerian, Tunisian, Djerban, and Libyan) and comparison with other Jewish and non-Jewish groups demonstrated distinctive North African Jewish population clusters with proximity to other Jewish populations and variable degrees of Middle Eastern, European, and North African admixture. Two major subgroups were identified by principal component, neighbor joining tree, and identity-by-descent analysis (Moroccan/Algerian and Djerban/Libyan) that varied in their degree of European admixture. These populations showed a high degree of endogamy and were part of a larger Ashkenazi and Sephardic Jewish group. By principal component analysis, these North African groups were orthogonal to contemporary populations from North and South Morocco, Western Sahara, Tunisia, Libya, and Egypt. Thus, this study is compatible with the history of North African Jews: founding during Classical Antiquity with proselytism of local populations, followed by genetic isolation with the rise of Christianity and then Islam, and admixture following the emigration of Sephardic Jews during the Inquisition.


Subject(s)
Ethnicity , Jews/genetics , Africa , Black People/genetics , Cluster Analysis , Emigration and Immigration , Genetics, Population , Genome , Haplotypes , Humans , Judaism , Models, Genetic , Oligonucleotide Array Sequence Analysis , Phylogeny , White People/genetics
14.
PLoS Genet ; 8(1): e1002397, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22253600

ABSTRACT

North African populations are distinct from sub-Saharan Africans based on cultural, linguistic, and phenotypic attributes; however, the time and the extent of genetic divergence between populations north and south of the Sahara remain poorly understood. Here, we interrogate the multilayered history of North Africa by characterizing the effect of hypothesized migrations from the Near East, Europe, and sub-Saharan Africa on current genetic diversity. We present dense, genome-wide SNP genotyping array data (730,000 sites) from seven North African populations, spanning from Egypt to Morocco, and one Spanish population. We identify a gradient of likely autochthonous Maghrebi ancestry that increases from east to west across northern Africa; this ancestry is likely derived from "back-to-Africa" gene flow more than 12,000 years ago (ya), prior to the Holocene. The indigenous North African ancestry is more frequent in populations with historical Berber ethnicity. In most North African populations we also see substantial shared ancestry with the Near East, and to a lesser extent sub-Saharan Africa and Europe. To estimate the time of migration from sub-Saharan populations into North Africa, we implement a maximum likelihood dating method based on the distribution of migrant tracts. In order to first identify migrant tracts, we assign local ancestry to haplotypes using a novel, principal component-based analysis of three ancestral populations. We estimate that a migration of western African origin into Morocco began about 40 generations ago (approximately 1,200 ya); a migration of individuals with Nilotic ancestry into Egypt occurred about 25 generations ago (approximately 750 ya). Our genomic data reveal an extraordinarily complex history of migrations, involving at least five ancestral populations, into North Africa.


Subject(s)
Black People/genetics , Gene Flow/genetics , Genetic Variation , Population Dynamics , Population , Africa South of the Sahara/ethnology , Africa, Northern , Black People/history , DNA, Mitochondrial/genetics , Egypt, Ancient , Emigration and Immigration , Europe , Gene Pool , Genomics , Genotype , Haplotypes , History, Ancient , Humans , Middle East , Morocco , Polymorphism, Single Nucleotide , White People/genetics , White People/history
15.
Hum Biol ; 79(6): 679-86, 2007 Dec.
Article in English | MEDLINE | ID: mdl-18494377

ABSTRACT

A population sample from São Tomé e Príncipe (West Africa) was screened for the G6PD-deficient variants A- (376G/202A), Betica (376G/968C), and Santa Maria (376G/542T). G6PD locus haplotype diversity was also investigated using six intragenic RFLPs (FokI, PvuII, BspHI, PstI, BclI, NlaIII) and a (CTT)n microsatellite 18.61 kb within the G6PD locus. The estimated frequencies of the G6PD*B normal allele, the G6PD*A variant (376G), and the G6PD*A- allele were 0.698, 0.194, and 0.108, respectively. G6PD variants Betica and Santa Maria were not found. Similar levels of microsatellite diversity were found on variants G6PD*B and G6PD*A (H = 0.61 and 0.68, respectively), indicating a similar age for both alleles. All G6PD*A- alleles share the RFLP-microsatellite haplotype ++(-)+(-)+/195, the same haplotype described in nearly all the *A- alleles from sub-Saharan, Mexican Mestizo, and Portuguese populations, consistent with a single and recent origin of the G202A mutation on this *A haplotype.


Subject(s)
Glucosephosphate Dehydrogenase Deficiency/genetics , Glucosephosphate Dehydrogenase/genetics , Africa, Western/epidemiology , Female , Glucosephosphate Dehydrogenase Deficiency/classification , Glucosephosphate Dehydrogenase Deficiency/epidemiology , Haplotypes , Humans , Male , Polymorphism, Genetic , Polymorphism, Restriction Fragment Length