Results 1 - 12 of 12
1.
Genet Med ; 25(6): 100830, 2023 06.
Article in English | MEDLINE | ID: mdl-36939041

ABSTRACT

PURPOSE: The analysis of exome and genome sequencing data for the diagnosis of rare diseases is challenging and time-consuming. In this study, we evaluated an artificial intelligence model based on machine learning for automating variant prioritization for diagnosing rare genetic diseases in the Baylor Genetics clinical laboratory. METHODS: The automated analysis model was developed using a supervised learning approach based on thousands of manually curated variants. The model was evaluated on 2 cohorts. The model accuracy was determined using a retrospective cohort comprising 180 randomly selected exome cases (57 singletons, 123 trios), all of which were previously diagnosed and solved through manual interpretation. Diagnostic yield with the modified workflow was estimated using a prospective "production" cohort of 334 consecutive clinical cases. RESULTS: The model accurately pinpointed all manually reported variants as candidates. The reported variants were ranked in the top 10 candidate variants in 98.4% (121/123) of trio cases, in 93.0% (53/57) of single proband cases, and in 96.7% (174/180) of all cases. The accuracy of the model was reduced in some cases because of incomplete variant calling (eg, copy number variants) or incomplete phenotypic description. CONCLUSION: The automated model for case analysis assists clinical genetic laboratories in prioritizing candidate variants effectively. The use of such technology may facilitate the interpretation of genomic data for a large number of patients in the era of precision medicine.
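The top-10 percentages in the RESULTS can be reproduced from the case counts given in the abstract. The sketch below is only an arithmetic check; the function and variable names are illustrative and do not come from the study.

```python
# Counts as reported in the abstract:
# (cases with the reported variant ranked in the top 10, total cases).
top10_hits = {"trio": (121, 123), "singleton": (53, 57)}


def hit_rate(hits: int, total: int) -> float:
    """Percentage of cases whose reported variant ranked in the top 10."""
    return round(100.0 * hits / total, 1)


# Per-group rates, then the pooled rate over all cases.
rates = {group: hit_rate(h, n) for group, (h, n) in top10_hits.items()}
all_hits = sum(h for h, _ in top10_hits.values())
all_total = sum(n for _, n in top10_hits.values())
rates["all"] = hit_rate(all_hits, all_total)

print(all_hits, all_total, rates)
```

The pooled counts (174 of 180) are the sum of the trio and singleton counts, which is consistent with the overall figure quoted in the abstract.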


Subject(s)
Laboratories, Clinical , Rare Diseases , Humans , Rare Diseases/diagnosis , Rare Diseases/genetics , Laboratories , Artificial Intelligence , Retrospective Studies , Prospective Studies , Exome/genetics
2.
Hum Genomics ; 15(1): 72, 2021 12 20.
Article in English | MEDLINE | ID: mdl-34930489

ABSTRACT

BACKGROUND: Due to the limitations of the current routine diagnostic methods, low-level somatic mosaicism with variant allele fraction (VAF) < 10% is often undetected in clinical settings. To date, only a few studies have attempted to analyze tissue distribution of low-level parental mosaicism in a large clinical exome sequencing (ES) cohort. METHODS: Using a customized bioinformatics pipeline, we analyzed apparent de novo single-nucleotide variants or indels identified in the affected probands in ES trio data at Baylor Genetics clinical laboratories. Clinically relevant variants with VAFs between 30 and 70% in probands and lower than 10% in one parent were studied. DNA samples extracted from saliva, buccal cells, redrawn peripheral blood, urine, hair follicles, and nail, representing all three germ layers, were tested using PCR amplicon next-generation sequencing (amplicon NGS) and droplet digital PCR (ddPCR). RESULTS: In a cohort of 592 clinical ES trios, we found 61 trios, each with one parent suspected of low-level mosaicism. In 21 parents, the variants were validated using amplicon NGS and seven of them by ddPCR in peripheral blood DNA samples. The parental VAFs in blood samples varied between 0.08 and 9%. The distribution of VAFs in additional tissues ranged from 0.03% in hair follicles to 9% in re-drawn peripheral blood. CONCLUSIONS: Our study illustrates the importance of analyzing ES data using sensitive computational and molecular methods for low-level parental somatic mosaicism for clinically relevant variants previously diagnosed in routine clinical diagnostics as apparent de novo.
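The selection criteria in the METHODS (proband VAF between 30 and 70%, VAF below 10% in one parent) can be sketched as a simple filter. This is a hypothetical illustration, not the authors' customized pipeline: the `TrioVariant` record, the inclusive proband bounds, and the requirement that the other parent shows no variant reads are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrioVariant:
    """Hypothetical record: variant allele fractions (0.0 to 1.0) in a trio."""
    variant_id: str
    proband_vaf: float
    mother_vaf: float
    father_vaf: float


def suspected_mosaic_parent(v: TrioVariant) -> Optional[str]:
    """Return the parent suspected of low-level mosaicism, or None.

    Assumed criteria: proband VAF within 30-70% (inclusive), exactly one
    parent with a non-zero VAF below 10%, and no variant reads in the other.
    """
    if not (0.30 <= v.proband_vaf <= 0.70):
        return None
    parents = {"mother": v.mother_vaf, "father": v.father_vaf}
    low_level = [name for name, vaf in parents.items() if 0.0 < vaf < 0.10]
    absent = [name for name, vaf in parents.items() if vaf == 0.0]
    if len(low_level) == 1 and len(absent) == 1:
        return low_level[0]
    return None


# Toy inputs (made up for illustration).
examples = [
    TrioVariant("v1", 0.48, 0.02, 0.0),   # low-level signal in the mother
    TrioVariant("v2", 0.48, 0.0, 0.0),    # no parental signal
    TrioVariant("v3", 0.48, 0.45, 0.0),   # heterozygous in the mother
    TrioVariant("v4", 0.15, 0.02, 0.0),   # proband VAF outside 30-70%
]
results = [suspected_mosaic_parent(v) for v in examples]
print(results)
```

In practice a filter like this would only nominate candidates; the abstract describes confirming them with amplicon NGS and ddPCR.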


Subject(s)
Exome , Mosaicism , Exome/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Mouth Mucosa , Parents , Exome Sequencing
4.
Nat Genet ; 50(8): 1140-1150, 2018 08.
Article in English | MEDLINE | ID: mdl-29988122

ABSTRACT

Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, consistent with the three-dimensional chromatin organization measured by in situ Hi-C in T cells. Fifteen percent of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak (local-ATAC-QTLs). Local-ATAC-QTLs have the largest effects on co-accessible peaks, are associated with gene expression and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression.


Subject(s)
CD4-Positive T-Lymphocytes/physiology , Chromatin/genetics , Polymorphism, Single Nucleotide , Adult , Autoimmune Diseases/genetics , Female , Gene Expression Regulation , Genotype , Humans , Male , Regulatory Sequences, Nucleic Acid
5.
Cell ; 173(5): 1165-1178.e20, 2018 05 17.
Article in English | MEDLINE | ID: mdl-29706548

ABSTRACT

Cohesin extrusion is thought to play a central role in establishing the architecture of mammalian genomes. However, extrusion has not been visualized in vivo, and thus, its functional impact and energetics are unknown. Using ultra-deep Hi-C, we show that loop domains form by a process that requires cohesin ATPases. Once formed, however, loops and compartments are maintained for hours without energy input. Strikingly, without ATP, we observe the emergence of hundreds of CTCF-independent loops that link regulatory DNA. We also identify architectural "stripes," where a loop anchor interacts with entire domains at high frequency. Stripes often tether super-enhancers to cognate promoters, and in B cells, they facilitate Igh transcription and recombination. Stripe anchors represent major hotspots for topoisomerase-mediated lesions, which promote chromosomal translocations and cancer. In plasmacytomas, stripes can deregulate Igh-translocated oncogenes. We propose that higher organisms have coopted cohesin extrusion to enhance transcription and recombination, with implications for tumor development.


Subject(s)
Adenosine Triphosphate/metabolism , Cell Cycle Proteins/metabolism , Chromosomal Proteins, Non-Histone/metabolism , Genome , Animals , B-Lymphocytes/cytology , B-Lymphocytes/metabolism , CCCTC-Binding Factor/genetics , CCCTC-Binding Factor/metabolism , Cell Cycle Proteins/chemistry , Cell Cycle Proteins/genetics , Cell Line , Chondroitin Sulfate Proteoglycans/genetics , Chondroitin Sulfate Proteoglycans/metabolism , Chromatin/metabolism , Chromosomal Proteins, Non-Histone/chemistry , Chromosomal Proteins, Non-Histone/genetics , Chromosomes/metabolism , DNA-Binding Proteins , Humans , Mice , Mutagenesis , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Phosphoproteins/genetics , Phosphoproteins/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , Transcription, Genetic , Cohesins
6.
BMC Biol ; 15(1): 110, 2017 Nov 16.
Article in English | MEDLINE | ID: mdl-29145861

ABSTRACT

BACKGROUND: The de novo assembly of repeat-rich mammalian genomes using only high-throughput short read sequencing data typically results in highly fragmented genome assemblies that limit downstream applications. Here, we present an iterative approach to hybrid de novo genome assembly that incorporates datasets stemming from multiple genomic technologies and methods. We used this approach to improve the gray mouse lemur (Microcebus murinus) genome from early draft status to a near chromosome-scale assembly. METHODS: We used a combination of advanced genomic technologies to iteratively resolve conflicts and super-scaffold the M. murinus genome. RESULTS: We improved the M. murinus genome assembly to a scaffold N50 of 93.32 Mb. Whole genome alignments between our primary super-scaffolds and 23 human chromosomes revealed patterns that are congruent with historical comparative cytogenetic data, thus demonstrating the accuracy of our de novo scaffolding approach and allowing assignment of scaffolds to M. murinus chromosomes. Moreover, we utilized our independent datasets to discover and characterize sequences associated with centromeres across the mouse lemur genome. Quality assessment of the final assembly found 96% of mouse lemur canonical transcripts nearly complete, comparable to other published high-quality reference genome assemblies. CONCLUSIONS: We describe a new assembly of the gray mouse lemur (Microcebus murinus) genome with chromosome-scale scaffolds produced using a hybrid bioinformatic and sequencing approach. The approach is cost effective and produces superior results based on metrics of contiguity and completeness. Our results show that emerging genomic technologies can be used in combination to characterize centromeres of non-model species and to produce accurate de novo chromosome-scale genome assemblies of complex mammalian genomes.
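The RESULTS report contiguity as a scaffold N50. As a reminder of what that metric measures, here is a minimal sketch using the standard definition (the scaffold length at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length). The lengths below are made-up toy values, not data from the mouse lemur assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to at least
        # half of the total length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("n50 requires at least one scaffold length")


toy_scaffolds = [80, 70, 50, 40, 30, 20]  # arbitrary units, illustration only
print(n50(toy_scaffolds))
```

For the toy input the total is 290, so the half-way point of 145 is reached by the second-longest scaffold.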


Subject(s)
Centromere/genetics , Cheirogaleidae/genetics , Genome , Animals , Computational Biology , Female , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
7.
Mol Cell ; 67(6): 1037-1048.e6, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28890333

ABSTRACT

The three-dimensional arrangement of the human genome comprises a complex network of structural and regulatory chromatin loops important for coordinating changes in transcription during human development. To better understand the mechanisms underlying context-specific 3D chromatin structure and transcription during cellular differentiation, we generated comprehensive in situ Hi-C maps of DNA loops in human monocytes and differentiated macrophages. We demonstrate that dynamic looping events are regulatory rather than structural in nature and uncover widespread coordination of dynamic enhancer activity at preformed and acquired DNA loops. Enhancer-bound loop formation and enhancer activation of preformed loops together form multi-loop activation hubs at key macrophage genes. Activation hubs connect 3.4 enhancers per promoter and exhibit a strong enrichment for activator protein 1 (AP-1)-binding events, suggesting that multi-loop activation hubs involving cell-type-specific transcription factors represent an important class of regulatory chromatin structures for the spatiotemporal control of transcription.


Subject(s)
Cell Differentiation , Chromatin Assembly and Disassembly , Chromatin/metabolism , DNA/metabolism , Macrophages/metabolism , Transcription Factor AP-1/metabolism , Transcription, Genetic , Binding Sites , Cell Line, Tumor , Chromatin/chemistry , Chromatin/genetics , DNA/chemistry , DNA/genetics , Enhancer Elements, Genetic , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans , Nucleic Acid Conformation , Phenotype , Protein Binding , Time Factors , Transcription Factor AP-1/genetics
8.
Science ; 356(6333): 92-95, 2017 04 07.
Article in English | MEDLINE | ID: mdl-28336562

ABSTRACT

The Zika outbreak, spread by the Aedes aegypti mosquito, highlights the need to create high-quality assemblies of large genomes in a rapid and cost-effective way. Here we combine Hi-C data with existing draft assemblies to generate chromosome-length scaffolds. We validate this method by assembling a human genome, de novo, from short reads alone (67× coverage). We then combine our method with draft sequences to create genome assemblies of the mosquito disease vectors Ae. aegypti and Culex quinquefasciatus, each consisting of three scaffolds corresponding to the three chromosomes in each species. These assemblies indicate that almost all genomic rearrangements among these species occur within, rather than between, chromosome arms. The genome assembly procedure we describe is fast, inexpensive, and accurate, and can be applied to many species.


Subject(s)
Aedes/genetics , Contig Mapping/methods , Genome, Insect , Animals , Conserved Sequence , Culex/genetics , Gene Rearrangement , Humans , Nucleic Acid Conformation
9.
Proc Natl Acad Sci U S A ; 113(31): E4504-12, 2016 08 02.
Article in English | MEDLINE | ID: mdl-27432957

ABSTRACT

During interphase, the inactive X chromosome (Xi) is largely transcriptionally silent and adopts an unusual 3D configuration known as the "Barr body." Despite the importance of X chromosome inactivation, little is known about this 3D conformation. We recently showed that in humans the Xi chromosome exhibits three structural features, two of which are not shared by other chromosomes. First, like the chromosomes of many species, Xi forms compartments. Second, Xi is partitioned into two huge intervals, called "superdomains," such that pairs of loci in the same superdomain tend to colocalize. The boundary between the superdomains lies near DXZ4, a macrosatellite repeat whose Xi allele extensively binds the protein CCCTC-binding factor. Third, Xi exhibits extremely large loops, up to 77 megabases long, called "superloops." DXZ4 lies at the anchor of several superloops. Here, we combine 3D mapping, microscopy, and genome editing to study the structure of Xi, focusing on the role of DXZ4. We show that superloops and superdomains are conserved across eutherian mammals. By analyzing ligation events involving three or more loci, we demonstrate that DXZ4 and other superloop anchors tend to colocate simultaneously. Finally, we show that deleting DXZ4 on Xi leads to the disappearance of superdomains and superloops, changes in compartmentalization patterns, and changes in the distribution of chromatin marks. Thus, DXZ4 is essential for proper Xi packaging.


Subject(s)
Chromosomes, Human, X/genetics , Gene Deletion , Genome, Human/genetics , Microsatellite Repeats/genetics , X Chromosome Inactivation , Animals , Binding Sites/genetics , CCCTC-Binding Factor/metabolism , Chromatin/genetics , Chromatin/metabolism , Chromosome Mapping , Female , Humans , Macaca mulatta , Mice , Protein Binding
10.
Cell Syst ; 3(1): 95-8, 2016 07.
Article in English | MEDLINE | ID: mdl-27467249

ABSTRACT

Hi-C experiments explore the 3D structure of the genome, generating terabases of data to create high-resolution contact maps. Here, we introduce Juicer, an open-source tool for analyzing terabase-scale Hi-C datasets. Juicer allows users without a computational background to transform raw sequence data into normalized contact maps with one click. Juicer produces a hic file containing compressed contact matrices at many resolutions, facilitating visualization and analysis at multiple scales. Structural features, such as loops and domains, are automatically annotated. Juicer is available as open source software at http://aidenlab.org/juicer/.


Subject(s)
Genome , Algorithms , Computational Biology , Software
11.
Cell Syst ; 3(1): 99-101, 2016 07.
Article in English | MEDLINE | ID: mdl-27467250

ABSTRACT

Hi-C experiments study how genomes fold in 3D, generating contact maps containing features as small as 20 bp and as large as 200 Mb. Here we introduce Juicebox, a tool for exploring Hi-C and other contact map data. Juicebox allows users to zoom in and out of Hi-C maps interactively, just as a user of Google Earth might zoom in and out of a geographic map. Maps can be compared to one another, or to 1D tracks or 2D feature sets.


Subject(s)
Genome , Humans , Software
12.
Cell ; 159(7): 1665-80, 2014 Dec 18.
Article in English | MEDLINE | ID: mdl-25497547

ABSTRACT

We use in situ Hi-C to probe the 3D architecture of genomes, constructing haploid and diploid maps of nine cell types. The densest, in human lymphoblastoid cells, contains 4.9 billion contacts, achieving 1 kb resolution. We find that genomes are partitioned into contact domains (median length, 185 kb), which are associated with distinct patterns of histone marks and segregate into six subcompartments. We identify ∼10,000 loops. These loops frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species. Loop anchors typically occur at domain boundaries and bind CTCF. CTCF sites at loop anchors occur predominantly (>90%) in a convergent orientation, with the asymmetric motifs "facing" one another. The inactive X chromosome splits into two massive domains and contains large loops anchored at CTCF-binding repeats.


Subject(s)
Cell Nucleus/genetics , Chromatin/chemistry , Genome, Human , Animals , CCCTC-Binding Factor , Cell Line , Cell Nucleus/chemistry , Gene Expression Regulation , Histone Code , Humans , Mice , Molecular Conformation , Regulatory Sequences, Nucleic Acid , Repressor Proteins/metabolism