1.
Cell ; 143(5): 837-47, 2010 Nov 24.
Article in English | MEDLINE | ID: mdl-21111241

ABSTRACT

Understanding the prevailing mutational mechanisms responsible for human genome structural variation requires uniformity in the discovery of allelic variants and precision in terms of breakpoint delineation. We develop a resource based on capillary end sequencing of 13.8 million fosmid clones from 17 human genomes and characterize the complete sequence of 1054 large structural variants corresponding to 589 deletions, 384 insertions, and 81 inversions. We analyze the 2081 breakpoint junctions and infer potential mechanism of origin. Three mechanisms account for the bulk of germline structural variation: microhomology-mediated processes involving short (2-20 bp) stretches of sequence (28%), nonallelic homologous recombination (22%), and L1 retrotransposition (19%). The high quality and long-range continuity of the sequence reveals more complex mutational mechanisms, including repeat-mediated inversions and gene conversion, that are most often missed by other methods, such as comparative genomic hybridization, single nucleotide polymorphism microarrays, and next-generation sequencing.


Subject(s)
Genome, Human , Genomic Structural Variation , Mutation , Base Sequence , Gene Conversion , Humans , Molecular Sequence Data , Sequence Analysis, DNA
2.
Nature ; 453(7191): 56-64, 2008 May 01.
Article in English | MEDLINE | ID: mdl-18451855

ABSTRACT

Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale--particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation--a standard for genotyping platforms and a prelude to future individual genome sequencing projects.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Physical Chromosome Mapping , Sequence Analysis, DNA , Chromosome Inversion/genetics , Euchromatin/genetics , Gene Deletion , Geography , Haplotypes , Humans , Mutagenesis, Insertional/genetics , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Reproducibility of Results
3.
PLoS Genet ; 3(4): e63, 2007 Apr 20.
Article in English | MEDLINE | ID: mdl-17447845

ABSTRACT

The APOBEC3 gene family plays a role in innate cellular immunity inhibiting retroviral infection, hepatitis B virus propagation, and the retrotransposition of endogenous elements. We present a detailed sequence and population genetic analysis of a 29.5-kb common human deletion polymorphism that removes the APOBEC3B gene. We developed a PCR-based genotyping assay, characterized 1,277 human diversity samples, and found that the frequency of the deletion allele varies significantly among major continental groups (global FST = 0.2843). The deletion is rare in Africans and Europeans (frequency of 0.9% and 6%), more common in East Asians and Amerindians (36.9% and 57.7%), and almost fixed in Oceanic populations (92.9%). Despite a worldwide frequency of 22.5%, analysis of data from the International HapMap Project reveals that no single existing tag single nucleotide polymorphism may serve as a surrogate for the deletion variant, emphasizing that without careful analysis its phenotypic impact may be overlooked in association studies. Application of haplotype-based tests for selection revealed potential pitfalls in the direct application of existing methods to the analysis of genomic structural variation. These data emphasize the importance of directly genotyping structural variation in association studies and of accurately resolving variant breakpoints before proceeding with more detailed population-genetic analysis.


Subject(s)
Cytidine Deaminase/genetics , Gene Deletion , Genetics, Population , Polymorphism, Genetic , Gene Frequency , Genotype , Geography , Homozygote , Humans , Linkage Disequilibrium , Minor Histocompatibility Antigens , Molecular Sequence Data
4.
Am J Hum Genet ; 79(2): 275-90, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16826518

ABSTRACT

Studies of copy-number variation and linkage disequilibrium (LD) have typically excluded complex regions of the genome that are rich in duplications and prone to rearrangement. In an attempt to assess the heritability and LD of copy-number polymorphisms (CNPs) in duplication-rich regions of the genome, we profiled copy-number variation in 130 putative "rearrangement hotspot regions" among 269 individuals of European, Yoruba, Chinese, and Japanese ancestry analyzed by the International HapMap Consortium. Eighty-four hotspot regions, corresponding to 257 bacterial artificial chromosome (BAC) probes, showed evidence of copy-number differences. Despite a predisposing genetic architecture, no polymorphism was ever observed in the remaining 46 "rearrangement hotspots," and we suggest these represent excellent candidate sites for pathogenic rearrangements. We used a combination of BAC-based and high-density customized oligonucleotide arrays to resolve the molecular basis of structural rearrangements. For common variants (frequency >10%), we observed a distinct bias against copy-number losses, suggesting that deletions are subject to purifying selection. Heritability estimates did not differ significantly from 1.0 among the majority (30 of 34) of loci analyzed, consistent with normal Mendelian inheritance. Some of the CNPs in duplication-rich regions showed strong LD with nearby single-nucleotide polymorphisms (SNPs) and were observed to segregate on ancestral SNP haplotypes. However, LD with the best available SNP markers was weaker than has been reported for deletion polymorphisms in less complex regions of the genome. These observations may be accounted for by a low density of SNP data in duplicated regions, challenges in mapping and typing the CNPs, and the possibility that CNPs in these regions have rearranged on multiple haplotype backgrounds. Our results underscore the need for complete maps of genetic variation in duplication-rich regions of the genome.


Subject(s)
Gene Dosage , Gene Duplication , Genome, Human , Linkage Disequilibrium , Polymorphism, Genetic , Repetitive Sequences, Nucleic Acid , Gene Rearrangement , Humans , Polymorphism, Single Nucleotide
5.
Hum Mol Genet ; 15(7): 1159-67, 2006 Apr 01.
Article in English | MEDLINE | ID: mdl-16497726

ABSTRACT

The contribution of large-scale and intermediate-size structural variation (ISV) to human genetic disease and disease susceptibility is only beginning to be understood. The development of high-throughput genotyping technologies is one of the most critical aspects for future studies of linkage disequilibrium (LD) and disease association. Using a simple PCR-based method designed to assay the junctions of the breakpoints, we genotyped seven simple insertion and deletion polymorphisms ranging in size from 6.3 to 24.7 kb among 90 CEPH individuals. We then extended this analysis to a larger collection of samples (n=460) by application of an oligonucleotide extension-ligation genotyping assay. The analysis showed a high level of concordance (approximately 99%) when compared with PCR/sequence-validated genotypes. Using the available HapMap data, we observed significant LD (r² = 0.74-0.95) between each ISV and flanking single nucleotide polymorphisms, but this observation is likely to hold only for similar simple insertion/deletion events. The approach we describe may be used to characterize a large number of individuals in a cost-effective manner once the sequence organization of ISVs is known.


Subject(s)
Genetic Testing/methods , Genotype , Cohort Studies , Female , Genetic Variation , Humans , Linkage Disequilibrium , Male , Microarray Analysis , Models, Genetic , Polymorphism, Single Nucleotide
6.
Genome Res ; 15(10): 1344-56, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16169929

ABSTRACT

Structural changes (deletions, insertions, and inversions) between human and chimpanzee genomes have likely had a significant impact on lineage-specific evolution because of their potential for dramatic and irreversible mutation. The low-quality nature of the current chimpanzee genome assembly precludes the reliable identification of many of these differences. To circumvent this, we applied a method to optimally map chimpanzee fosmid paired-end sequences against the human genome to systematically identify sites of structural variation ≥ 12 kb between the two species. Our analysis yielded a total of 651 putative sites of chimpanzee deletions (n = 293), insertions (n = 184), and rearrangements consistent with local inversions between the two genomes (n = 174). We validated a subset (19/23) of insertions and deletions using PCR and Southern blot assays, confirming the accuracy of our method. The events are distributed throughout the genome on all chromosomes but are highly correlated with sites of segmental duplication in human and chimpanzee. These structural variants encompass at least 24 Mb of DNA and overlap with > 245 genes. Seventeen of these genes contain exons missing in the chimpanzee genomic sequence and also show a significant reduction in gene expression in chimpanzee. Compared with the pioneering work of Yunis, Prakash, Dutrillaux, and Lejeune, this analysis expands the number of potential rearrangements between chimpanzees and humans 50-fold. Furthermore, this work prioritizes regions for further finishing in the chimpanzee genome and provides a resource for interrogating functional differences between humans and chimpanzees.


Subject(s)
Genome , Pan troglodytes/genetics , Animals , Data Collection , Humans , Oligonucleotide Array Sequence Analysis , Reverse Transcriptase Polymerase Chain Reaction , Sequence Deletion
7.
Genome Res ; 14(4): 603-8, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15060001

ABSTRACT

The mouse V1R putative pheromone receptor gene family consists of at least 137 intact genes clustered at multiple chromosomal locations in the genome. Species-specific pheromone receptor repertoires may partly explain species-specific social behavior. We conducted a genomic analysis of an orthologous pair of mouse and rat V1R gene clusters to test for species specificity in rodent pheromone systems. Mouse and rat have lineage-specific V1R repertoires in each of three major subfamilies at these loci as a result of postspeciation duplications, gene loss, and gene conversions. The onset of this diversification roughly coincides with a wave of Line1 (L1) retrotranspositions into the two loci. We propose that L1 activity has facilitated postspeciation V1R duplications and gene conversions. In addition, we find extensive homology among putative V1R promoter regions in both species. We propose a regulatory model in which promoter homogenization could ensure that V1R genes are equally competitive for a limiting transcriptional structure to account for mutually exclusive V1R expression in vomeronasal neurons.


Subject(s)
Multigene Family/genetics , Receptors, Pheromone/genetics , Animals , Chromosome Mapping/methods , Computational Biology/statistics & numerical data , Databases, Genetic , Likelihood Functions , Mice , Phylogeny , Rats , Species Specificity , Synteny/genetics
8.
Genome Res ; 13(5): 781-93, 2003 May.
Article in English | MEDLINE | ID: mdl-12727898

ABSTRACT

Large segmental duplications (SDs) constitute at least 3.6% of the human genome and have increased its size, complexity, and diversity. SDs can mediate ectopic sequence exchange resulting in gross chromosomal rearrangements that could contribute to speciation and disease. We have identified and evaluated a subset of human SDs that harbor an 88-member subfamily of olfactory receptor (OR)-like genes called the 7Es. At least 92% of these genes appear to be pseudogenes when compared to other OR genes. The 7E-containing SDs (7E SDs) have duplicated to at least 35 regions of the genome via intra- and interchromosomal duplication events. In contrast to many human SDs, the 7E SDs are not biased towards pericentromeric or subtelomeric regions. We find evidence for gene conversion among 7E genes and larger sequence exchange between 7E SDs, supporting the hypothesis that long, highly similar stretches of DNA facilitate ectopic interactions. The complex structure and history of the 7E SDs necessitates extension of the current model of large-scale DNA duplication. Despite their appearance as pseudogenes, some 7E genes exhibit a signature of purifying selection, and at least one 7E gene is expressed.


Subject(s)
Evolution, Molecular , Gene Duplication , Genome, Human , Receptors, Odorant/genetics , Amino Acid Sequence/genetics , Animals , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Chromosomes, Human, Pair 19/genetics , Computational Biology/methods , Databases, Genetic/statistics & numerical data , Gene Conversion/genetics , Genetic Markers/genetics , Humans , In Situ Hybridization, Fluorescence/methods , In Situ Hybridization, Fluorescence/statistics & numerical data , Mice , Molecular Sequence Data , Multigene Family/genetics , Open Reading Frames/genetics , Phylogeny , Pseudogenes , Selection, Genetic
9.
Genome Res ; 12(11): 1663-72, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12421752

ABSTRACT

Various portions of the region surrounding the site where two ancestral chromosomes fused to form human chromosome 2 are duplicated elsewhere in the human genome, primarily in subtelomeric and pericentromeric locations. At least 24 potentially functional genes and 16 pseudogenes reside in the 614-kb of sequence surrounding the fusion site and paralogous segments on other chromosomes. By comparing the sequences of genomic copies and transcripts, we show that at least 18 of the genes in these paralogous regions are transcriptionally active. Among these genes are new members of the cobalamin synthetase W domain (CBWD) and forkhead domain FOXD4 gene families. Copies of RPL23A and SNRPA1 on chromosome 2 are retrotransposed-processed pseudogenes that were included in segmental duplications; we find 53 RPL23A pseudogenes in the human genome and map the functional copy of SNRPA1 to 15qter. The draft sequence of the human genome also provides new information on the location and intron-exon structure of functional copies of other 2q-fusion genes (PGM5, retina-specific F379, helicase CHLR1, and acrosin). This study illustrates that the duplication and rearrangement of subtelomeric and pericentromeric regions have functional relevance to human biology; these processes can change gene dosage and/or generate genes with new functions.


Subject(s)
Chromosomes, Human, Pair 2/chemistry , Chromosomes, Human, Pair 2/physiology , Evolution, Molecular , Genes/genetics , Phosphoglucomutase , Sequence Homology, Nucleic Acid , Translocation, Genetic/genetics , Translocation, Genetic/physiology , Amino Acid Sequence/genetics , Base Sequence/genetics , Centromere/genetics , Cytoskeletal Proteins/genetics , DNA-Binding Proteins/genetics , Forkhead Transcription Factors , Gene Duplication , Humans , Molecular Sequence Data , Multigene Family/genetics , Nitrogenous Group Transferases/genetics , Organ Specificity/genetics , Protein Structure, Tertiary/genetics , Protein Structure, Tertiary/physiology , Pseudogenes/genetics , Retina/chemistry , Retina/metabolism , Ribonucleoproteins, Small Nuclear/genetics , Ribosomal Proteins/genetics , Trans-Activators/genetics