1.
BMC Genomics ; 15: 387, 2014 May 20.
Article in English | MEDLINE | ID: mdl-24885025

ABSTRACT

BACKGROUND: Although the reference human genome sequence was declared finished in 2003, some regions of the genome remain incomplete due to their complex architecture. One such region, 1q21.1-q21.2, is of increasing interest due to its relevance to human disease and evolution. Elucidation of the exact variants behind these associations has been hampered by the repetitive nature of the region and its incomplete assembly. This region also contains 238 of the 270 human DUF1220 protein domains, which are implicated in human brain evolution and neurodevelopment. Additionally, examinations of this protein domain have been challenging due to the incomplete 1q21 build. To address these problems, a single-haplotype hydatidiform mole BAC library (CHORI-17) was used to produce the first complete sequence of the 1q21.1-q21.2 region. RESULTS: We found and addressed several inaccuracies in the GRCh37 sequence of the 1q21 region on large and small scales, including genomic rearrangements and inversions, and incorrect gene copy number estimates and assemblies. The DUF1220-encoding NBPF genes required the most corrections, with 3 genes removed, 2 genes reassigned to the 1p11.2 region, 8 genes requiring assembly corrections for DUF1220 domains (~91 DUF1220 domains were misassigned), and multiple instances of nucleotide changes that reassigned the domain to a different DUF1220 subtype. These corrections resulted in an overall increase in DUF1220 copy number, yielding a haploid total of 289 copies. Approximately 20 of these new DUF1220 copies were the result of a segmental duplication from 1q21.2 to 1p11.2 that included two NBPF genes. Interestingly, this duplication may have been the catalyst for the evolutionarily important human lineage-specific chromosome 1 pericentric inversion. CONCLUSIONS: Through the hydatidiform mole genome sequencing effort, the 1q21.1-q21.2 region is complete and misassemblies involving inter- and intra-region duplications have been resolved. The availability of this single haploid sequence path will aid in the investigation of many genetic diseases linked to 1q21, including several associated with DUF1220 copy number variations. Finally, the corrected sequence identified a recent segmental duplication that added 20 additional DUF1220 copies to the human genome, and may have facilitated the chromosome 1 pericentric inversion that is among the most notable human-specific genomic landmarks.


Subject(s)
Chromosomes, Human, Pair 1 , Genome, Human , Biological Evolution , Carrier Proteins/genetics , Comparative Genomic Hybridization , DNA Copy Number Variations , Genetic Linkage , Haploidy , Humans , Protein Structure, Tertiary/genetics , Segmental Duplications, Genomic
2.
Nat Rev Genet ; 13(12): 853-66, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23154808

ABSTRACT

Given the unprecedented tools that are now available for rapidly comparing genomes, the identification and study of genetic and genomic changes that are unique to our species have accelerated, and we are entering a golden age of human evolutionary genomics. Here we provide an overview of these efforts, highlighting important recent discoveries, examples of the different types of human-specific genomic and genetic changes identified, and salient trends, such as the localization of evolutionary adaptive changes to complex loci that are highly enriched for disease associations. Finally, we discuss the remaining challenges, such as the incomplete nature of current genome sequence assemblies and difficulties in linking human-specific genomic changes to human-specific phenotypic traits.


Subject(s)
Evolution, Molecular , Genome, Human , Animals , Chromosomes, Human/genetics , Comparative Genomic Hybridization , Genetic Speciation , Genomics/trends , Hominidae/genetics , Humans , Karyotyping , Phenotype , Primates/genetics , Species Specificity
3.
G3 (Bethesda) ; 2(9): 977-86, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22973535

ABSTRACT

DUF1220 protein domains exhibit the most extreme human lineage-specific (HLS) copy number increase of any protein coding region in the human genome and have recently been linked to evolutionary and pathological changes in brain size (e.g., 1q21-associated microcephaly). These findings lend support to the view that DUF1220 domain dosage is a key factor in the determination of primate (and human) brain size. Here we analyze 41 animal genomes and present the most complete account to date of the evolutionary history and genome organization of DUF1220 domains and the gene family that encodes them (NBPF). Included among the novel features identified by this analysis is a DUF1220 domain precursor in nonmammalian vertebrates, a unique predicted promoter common to all mammalian NBPF genes, six distinct clades into which DUF1220 sequences can be subdivided, and a previously unknown member of the NBPF gene family (NBPF25). Most importantly, we show that the exceptional HLS increase in DUF1220 copy number (from 102 in our last common ancestor with chimp to 272 in human; an average HLS increase of ~28 copies every million years since the Homo/Pan split) was driven by intragenic domain hyperamplification. This increase primarily involved a 4.7 kb, tandemly repeated three DUF1220 domain unit we have named the HLS DUF1220 triplet, a motif that is a likely candidate to underlie key properties unique to the Homo sapiens brain. Interestingly, all copies of the HLS DUF1220 triplet lie within a human-specific pericentric inversion that also includes the 1q12 C-band, a polymorphic heterochromatin expansion that is unique to the human genome. Both cytogenetic features likely played key roles in the rapid HLS DUF1220 triplet hyperamplification, which is among the most striking genomic changes specific to the human lineage.


Subject(s)
Evolution, Molecular , Genome , Protein Structure, Tertiary/genetics , Animals , Gene Dosage , Gene Order , Humans , Molecular Sequence Annotation , Multigene Family , Phylogeny , Physical Chromosome Mapping , Proteins/chemistry , Proteins/genetics
4.
Am J Hum Genet ; 91(3): 444-54, 2012 Sep 07.
Article in English | MEDLINE | ID: mdl-22901949

ABSTRACT

DUF1220 domains show the largest human-lineage-specific increase in copy number of any protein-coding region in the human genome and map primarily to 1q21, where deletions and reciprocal duplications have been associated with microcephaly and macrocephaly, respectively. Given these findings and the high correlation between DUF1220 copy number and brain size across primate lineages (R² = 0.98; p = 1.8 × 10⁻⁶), DUF1220 sequences represent plausible candidates for underlying 1q21-associated brain-size pathologies. To investigate this possibility, we used specialized bioinformatics tools developed for scoring highly duplicated DUF1220 sequences to implement targeted 1q21 array comparative genomic hybridization on individuals (n = 42) with 1q21-associated microcephaly and macrocephaly. We show that of all the 1q21 genes examined (n = 53), DUF1220 copy number shows the strongest association with brain size among individuals with 1q21-associated microcephaly, particularly with respect to the three evolutionarily conserved DUF1220 clades CON1 (p = 0.0079), CON2 (p = 0.0134), and CON3 (p = 0.0116). Interestingly, all 1q21 DUF1220-encoding genes belonging to the NBPF family show significant correlations with frontal-occipital-circumference Z scores in the deletion group. In a similar survey of a nondisease population, we show that DUF1220 copy number exhibits the strongest correlation with brain gray-matter volume (CON1, p = 0.0246; and CON2, p = 0.0334). Notably, only DUF1220 sequences are consistently significant in both disease and nondisease populations. Taken together, these data strongly implicate the loss of DUF1220 copy number in the etiology of 1q21-associated microcephaly and support the view that DUF1220 domains function as general effectors of evolutionary, pathological, and normal variation in brain size.


Subject(s)
Brain/pathology , DNA Copy Number Variations , Organ Size , Animals , Base Sequence , Biological Evolution , Chromosomes, Human, Pair 1 , Comparative Genomic Hybridization , Gene Duplication , Humans , Megalencephaly/genetics , Microcephaly/genetics
5.
Plant Physiol ; 148(4): 1760-71, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18952860

ABSTRACT

Retrotransposons and their remnants often constitute more than 50% of higher plant genomes. Although extensively studied in monocot crops such as maize (Zea mays) and rice (Oryza sativa), the impact of retrotransposons on dicot crop genomes is not well documented. Here, we present an analysis of retrotransposons in soybean (Glycine max). Analysis of approximately 3.7 megabases (Mb) of genomic sequence, including 0.87 Mb of pericentromeric sequence, uncovered 45 intact long terminal repeat (LTR)-retrotransposons. The ratio of intact elements to solo LTRs was 8:1, one of the highest reported to date in plants, suggesting that removal of retrotransposons by homologous recombination between LTRs is occurring more slowly in soybean than in previously characterized plant species. Analysis of paired LTR sequences uncovered a low frequency of deletions relative to base substitutions, indicating that removal of retrotransposon sequences by illegitimate recombination is also operating more slowly. Significantly, we identified three subfamilies of nonautonomous elements that have replicated in the recent past, suggesting that retrotransposition can be catalyzed in trans by autonomous elements elsewhere in the genome. Analysis of 1.6 Mb of sequence from Glycine tomentella, a wild perennial relative of soybean, uncovered 23 intact retroelements, two of which had accumulated no mutations in their LTRs, indicating very recent insertion. A similar pattern was found in 0.94 Mb of sequence from Phaseolus vulgaris (common bean). Thus, autonomous and nonautonomous retrotransposons appear to be both abundant and active in Glycine and Phaseolus. The impact of nonautonomous retrotransposon replication on genome size appears to be much greater than previously appreciated.


Subject(s)
Evolution, Molecular , Glycine max/genetics , Retroelements , Base Sequence , DNA, Plant/chemistry , Gene Deletion , Genome, Plant , Genomics/methods , Long Interspersed Nucleotide Elements , Methylation , Mutagenesis, Insertional , Phaseolus/genetics , Phylogeny , Sequence Alignment , Sequence Analysis, DNA , Terminal Repeat Sequences
6.
Plant Physiol ; 148(4): 1740-59, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18842825

ABSTRACT

The genomes of most, if not all, flowering plants have undergone whole genome duplication events during their evolution. The impact of such polyploidy events is poorly understood, as is the fate of most duplicated genes. We sequenced an approximately 1 million-bp region in soybean (Glycine max) centered on the Rpg1-b disease resistance gene and compared this region with a region duplicated 10 to 14 million years ago. These two regions were also compared with homologous regions in several related legume species (a second soybean genotype, Glycine tomentella, Phaseolus vulgaris, and Medicago truncatula), which enabled us to determine how each of the duplicated regions (homoeologues) in soybean has changed following polyploidy. The biggest change was in retroelement content, with homoeologue 2 having expanded to 3-fold the size of homoeologue 1. Despite this accumulation of retroelements, over 77% of the duplicated low-copy genes have been retained in the same order and appear to be functional. This finding contrasts with recent analyses of the maize (Zea mays) genome, in which only about one-third of duplicated genes appear to have been retained over a similar time period. Fluorescent in situ hybridization revealed that the homoeologue 2 region is located very near a centromere. Thus, pericentromeric localization, per se, does not result in a high rate of gene inactivation, despite greatly accelerated retrotransposon accumulation. In contrast to low-copy genes, nucleotide-binding-leucine-rich repeat disease resistance gene clusters have undergone dramatic species/homoeologue-specific duplications and losses, with some evidence for partitioning of subfamilies between homoeologues.


Subject(s)
Evolution, Molecular , Gene Duplication , Genes, Plant , Glycine max/genetics , Polyploidy , Retroelements , Centromere/genetics , Chromosomes, Artificial, Bacterial , DNA, Plant/chemistry , Gene Deletion , Genome, Plant , Immunity, Innate/genetics , Multigene Family , Mutagenesis, Insertional , Phaseolus/genetics , Phylogeny , Plant Diseases/genetics , Sequence Analysis, DNA
7.
Theor Appl Genet ; 117(3): 449-58, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18504542

ABSTRACT

The small genome size (740 Mb), short life cycle (3 months) and high economic importance as a food crop legume make chickpea (Cicer arietinum L.) an important system for genomics research. Although several genetic linkage maps using various markers and genomic tools have become available, sequencing efforts and their use are limited in chickpea genomic research. In this study, we explored the genome organization of chickpea by sequencing approximately 500 kb from 11 BAC clones (three representing ascochyta blight resistance QTL1 (ABR-QTL1) and eight randomly selected BAC clones). Our analysis revealed that these sequenced chickpea genomic regions have a gene density of one per 9.2 kb, an average gene length of 2,500 bp, an average of 4.7 exons per gene, with an average exon and intron size of 401 and 316 bp, respectively, and approximately 8.6% repetitive elements. Other features analyzed included exon and intron length, number of exons per gene, protein length and %GC content. Although there are reports on high synteny among legume genomes, the microsynteny between the 500 kb chickpea and available Medicago truncatula genomic sequences varied depending on the region analyzed. The GBrowse-based annotation of these BACs is available at http://www.genome.ou.edu/plants_totals.html . We believe that our work provides significant information that supports a chickpea genome sequencing effort in the future.


Subject(s)
Base Pairing/genetics , Chromosomes, Artificial, Bacterial/genetics , Cicer/genetics , Genome, Plant/genetics , Synteny/genetics , Base Sequence , Clone Cells , Quantitative Trait Loci/genetics , Retroelements/genetics
8.
Plant Physiol ; 146(1): 5-21, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17981990

ABSTRACT

The nucleotide-binding site (NBS)-Leucine-rich repeat (LRR) gene family accounts for the largest number of known disease resistance genes, and is one of the largest gene families in plant genomes. We have identified 333 nonredundant NBS-LRRs in the current Medicago truncatula draft genome (Mt1.0), likely representing 400 to 500 NBS-LRRs in the full genome, or roughly 3 times the number present in Arabidopsis (Arabidopsis thaliana). Although many characteristics of the gene family are similar to those described on other plant genomes, several evolutionary features are particularly pronounced in M. truncatula, including a high degree of clustering, evidence of significant numbers of ectopic translocations from clusters to other parts of the genome, a small number of more evolutionarily stable NBS-LRRs, and numerous truncations and fusions leading to novel domain compositions. The gene family clearly has had a large impact on the structure of the genome, both through ectopic translocations (potentially, a means of seeding new NBS-LRR clusters), and through two extraordinarily large superclusters. Chromosome 6 encodes approximately 34% of all TIR-NBS-LRRs, while chromosome 3 encodes approximately 40% of all coiled-coil-NBS-LRRs. Almost all atypical domain combinations are in the TIR-NBS-LRR subfamily, with many occurring within one genomic cluster. This analysis shows the gene family not only is important functionally and agronomically, but also plays a structural role in the genome.


Subject(s)
Genes, Plant/genetics , Medicago truncatula/genetics , Amino Acid Motifs , Cluster Analysis , Computers , DNA Transposable Elements , Expressed Sequence Tags , Gene Duplication , Gene Expression Regulation, Plant , Gene Library , Genome, Plant , Phylogeny , Plant Proteins/genetics , Promoter Regions, Genetic , Protein Structure, Tertiary , Pseudogenes/genetics
9.
BMC Genomics ; 8: 330, 2007 Sep 19.
Article in English | MEDLINE | ID: mdl-17880721

ABSTRACT

BACKGROUND: Soybean, Glycine max (L.) Merr., is a well-documented paleopolyploid. What remains relatively undercharacterized is the level of sequence identity in retained homeologous regions of the genome. Recently, the Department of Energy Joint Genome Institute and United States Department of Agriculture jointly announced the sequencing of the soybean genome. One of the initial concerns is what effect sequence identity in homeologous regions would have on whole genome shotgun sequence assembly. RESULTS: Seventeen BACs representing approximately 2.03 Mb were sequenced as representative potential homeologous regions from the soybean genome. Genetic mapping of each BAC shows that 11 of the 20 chromosomes are represented. Sequence comparisons between homeologous BACs show that the soybean genome is a mosaic of retained paleopolyploid regions. Some regions appear to be highly conserved while other regions have diverged significantly. Large-scale "batch" reassembly of all 17 BACs combined showed that even the most homeologous BACs with upwards of 95% sequence identity resolve into their respective homeologous sequences. Potential assembly errors were generated by tandemly duplicated pentatricopeptide repeat-containing genes and long simple sequence repeats. Analysis of a whole-genome shotgun assembly of 80,000 randomly chosen JGI-DOE sequence traces reveals some new soybean-specific repeat sequences. CONCLUSION: This analysis investigated both the structure of the paleopolyploid soybean genome and the potential effects retained homeology will have on assembling the whole genome shotgun sequence. Based upon these results, homeologous regions similar to those characterized here will not cause major assembly issues.


Subject(s)
Genes, Duplicate/genetics , Genome, Plant/genetics , Glycine max/genetics , Physical Chromosome Mapping/methods , Polyploidy , Repetitive Sequences, Nucleic Acid , Sequence Analysis, DNA/methods , Base Sequence/genetics , Chromosomes, Artificial, Bacterial/genetics , Chromosomes, Plant/genetics , Evolution, Molecular , Genetic Markers , Microsatellite Repeats , Phylogeny , Polymorphism, Genetic/genetics , Software , Species Specificity , Synteny/genetics