Results 1 - 20 of 70
1.
N Engl J Med ; 390(21): 1985-1997, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38838312

ABSTRACT

BACKGROUND: Genetic variants that cause rare disorders may remain elusive even after expansive testing, such as exome sequencing. The diagnostic yield of genome sequencing, particularly after a negative evaluation, remains poorly defined. METHODS: We sequenced and analyzed the genomes of families with diverse phenotypes who were suspected to have a rare monogenic disease and for whom genetic testing had not revealed a diagnosis, as well as the genomes of a replication cohort at an independent clinical center. RESULTS: We sequenced the genomes of 822 families (744 in the initial cohort and 78 in the replication cohort) and made a molecular diagnosis in 218 of 744 families (29.3%). Of the 218 families, 61 (28.0%) - 8.2% of families in the initial cohort - had variants that required genome sequencing for identification, including coding variants, intronic variants, small structural variants, copy-neutral inversions, complex rearrangements, and tandem repeat expansions. Most families in which a molecular diagnosis was made after previous nondiagnostic exome sequencing (63.5%) had variants that could be detected by reanalysis of the exome-sequence data (53.4%) or by applying additional analytic methods, such as copy-number variant calling, to exome-sequence data (10.8%). We obtained similar results in the replication cohort: in 33% of the families in which a molecular diagnosis was made, or 8% of the cohort, genome sequencing was required, which showed the applicability of these findings to both research and clinical environments. CONCLUSIONS: The diagnostic yield of genome sequencing in a large, diverse research cohort and in a small clinical cohort of persons who had previously undergone genetic testing was approximately 8% and included several types of pathogenic variation that had not previously been detected by means of exome sequencing or other techniques. (Funded by the National Human Genome Research Institute and others.)


Subject(s)
Genetic Variation , Rare Diseases , Whole Genome Sequencing , Female , Humans , Male , Cohort Studies , Exome , Exome Sequencing , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/ethnology , Genetic Diseases, Inborn/genetics , Genetic Testing , Genome, Human , Phenotype , Rare Diseases/diagnosis , Rare Diseases/ethnology , Rare Diseases/genetics , Sequence Analysis, DNA , Child , Adolescent , Young Adult , Adult
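
The percentages quoted in the abstract above (ID: mdl-38838312) follow directly from the family counts it reports. The short Python sketch below recomputes them; the helper `percent` and all variable names are illustrative assumptions, and only the counts come from the abstract.

```python
def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)

# Counts quoted in the abstract.
initial_families = 744      # initial cohort
replication_families = 78   # replication cohort
diagnosed = 218             # molecular diagnoses in the initial cohort
needed_genome = 61          # diagnoses that required genome sequencing

total_families = initial_families + replication_families
overall_yield = percent(diagnosed, initial_families)
share_needing_genome = percent(needed_genome, diagnosed)
genome_specific_yield = percent(needed_genome, initial_families)

print(total_families, overall_yield, share_needing_genome, genome_specific_yield)
```

The last value is the "approximately 8%" figure given in the conclusions: it is the yield attributable to genome sequencing alone, not the overall diagnostic rate.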
2.
bioRxiv ; 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38746320

ABSTRACT

Pediatric solid tumors are rare malignancies that represent a leading cause of death by disease among children in developed countries. The early age-of-onset of these tumors suggests that germline genetic factors are involved, yet conventional germline testing for short coding variants in established predisposition genes only identifies pathogenic events in 10-15% of patients. Here, we examined the role of germline structural variants (SVs), an underexplored form of germline variation, in pediatric extracranial solid tumors using germline genome sequencing of 1,766 affected children, their 943 unaffected relatives, and 6,665 adult controls. We discovered a sex-biased association between very large (>1 megabase) germline chromosomal abnormalities and a four-fold increased risk of solid tumors in male children. The overall impact of germline SVs was greatest in neuroblastoma, where we revealed burdens of ultra-rare SVs that cause loss-of-function of highly expressed, mutationally intolerant, neurodevelopmental genes, as well as noncoding SVs predicted to disrupt three-dimensional chromatin domains in neural crest-derived tissues. Collectively, our results implicate rare germline SVs as a predisposing factor to pediatric solid tumors that may guide future studies and clinical practice.

3.
Am J Hum Genet ; 111(5): 863-876, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38565148

ABSTRACT

Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new methods, can now reliably be identified from exome sequencing. Challenges remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the Genomics Research to Elucidate the Genetics of Rare Diseases consortium and analyzed using the seqr platform. The addition of CNV detection to exome analysis identified causal CNVs for 171 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb. The causal CNVs consisted of 140 deletions, 15 duplications, 3 suspected complex structural variants (SVs), 3 insertions, and 10 complex SVs, the latter two groups being identified by orthogonal confirmation methods. To classify CNV variant pathogenicity, we used the 2020 American College of Medical Genetics and Genomics/ClinGen CNV interpretation standards and developed additional criteria to evaluate allelic and functional data as well as variants on the X chromosome to further advance the framework. We interpreted 151 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.


Subject(s)
DNA Copy Number Variations , Exome Sequencing , Exome , Rare Diseases , Humans , DNA Copy Number Variations/genetics , Rare Diseases/genetics , Rare Diseases/diagnosis , Exome/genetics , Male , Female , Cohort Studies , Genetic Testing/methods
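
The counts in the abstract above (ID: mdl-38565148) can be cross-checked against each other: the CNV types and the pathogenicity classes should each add up to the number of solved families. The sketch below does this in Python. It assumes one causal CNV per solved family, which the abstract implies but does not state; the dictionary keys and variable names are illustrative.

```python
# Causal CNVs by type, as listed in the abstract.
cnv_types = {
    "deletions": 140,
    "duplications": 15,
    "suspected complex SVs": 3,
    "insertions": 3,
    "complex SVs": 10,
}

# Causal CNVs by interpretation class, as listed in the abstract.
cnv_classes = {
    "likely pathogenic/pathogenic": 151,
    "high-interest VUS": 20,
}

families_total = 6633
families_solved = 171

type_total = sum(cnv_types.values())
class_total = sum(cnv_classes.values())
yield_pct = round(100 * families_solved / families_total, 1)

print(type_total, class_total, yield_pct)
```

Both partitions agree with the 171 solved families, and the ratio to the 6,633 families gives the reported 2.6%.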
4.
medRxiv ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38585811

ABSTRACT

Purpose: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). Methods: We coupled phenotyping with exome or genome sequencing of 467 pedigrees with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. Results: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. Conclusion: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.

6.
Genet Med ; 26(5): 101076, 2024 May.
Article in English | MEDLINE | ID: mdl-38258669

ABSTRACT

PURPOSE: Genome sequencing (GS)-specific diagnostic rates in prospective tightly ascertained exome sequencing (ES)-negative intellectual disability (ID) cohorts have not been reported extensively. METHODS: ES, GS, epigenetic signatures, and long-read sequencing diagnoses were assessed in 74 trios with at least moderate ID. RESULTS: The ES diagnostic yield was 42 of 74 (57%). GS diagnoses were made in 9 of 32 (28%) ES-unresolved families. Repeated ES with a contemporary pipeline on the GS-diagnosed families identified 8 of 9 single-nucleotide variations/copy-number variations undetected in older ES, confirming a GS-unique diagnostic rate of 1 in 32 (3%). Episignatures contributed diagnostic information in 9% with GS corroboration in 1 of 32 (3%) and diagnostic clues in 2 of 32 (6%). A genetic etiology for ID was detected in 51 of 74 (69%) families. Twelve candidate disease genes were identified. Contemporary ES followed by GS cost US$4976 (95% CI: $3704; $6969) per diagnosis, and first-line GS cost $7062 (95% CI: $6210; $8475) per diagnosis. CONCLUSION: Performing GS only in ID trios would be cost equivalent to ES if GS were available at $2435, about a 60% reduction from current prices. This study demonstrates that first-line GS achieves a higher diagnostic rate than contemporary ES but at a higher cost.


Subject(s)
Exome Sequencing , Exome , Intellectual Disability , Humans , Intellectual Disability/genetics , Intellectual Disability/diagnosis , Male , Female , Exome/genetics , Exome Sequencing/economics , Cohort Studies , Genetic Testing/economics , Genetic Testing/methods , Whole Genome Sequencing/economics , Child , Genome, Human/genetics , DNA Copy Number Variations/genetics , Polymorphism, Single Nucleotide/genetics , Child, Preschool
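
The tiered yields in the abstract above (ID: mdl-38258669) use two different denominators: all 74 trios for ES and for the combined result, and only the 32 ES-unresolved trios for GS. The Python sketch below makes that bookkeeping explicit. Variable names and the `pct` helper are illustrative assumptions; the counts are taken from the abstract.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number, as in the abstract."""
    return round(100 * part / whole)

trios = 74
es_diagnosed = 42                      # first tier: exome sequencing
es_unresolved = trios - es_diagnosed   # trios passed on to genome sequencing
gs_diagnosed = 9                       # second tier: genome sequencing
gs_unique = 1                          # still missed after ES was repeated

total_diagnosed = es_diagnosed + gs_diagnosed

print(es_unresolved, total_diagnosed)
print(pct(es_diagnosed, trios), pct(gs_diagnosed, es_unresolved),
      pct(gs_unique, es_unresolved), pct(total_diagnosed, trios))
```

Because 8 of the 9 GS diagnoses were recoverable by repeating ES with a contemporary pipeline, the GS-unique rate (1 of 32) is much lower than the GS yield in ES-unresolved trios (9 of 32).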
7.
medRxiv ; 2023 Oct 05.
Article in English | MEDLINE | ID: mdl-37873196

ABSTRACT

Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new methods, can now reliably be identified from exome sequencing. Challenges remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the GREGoR consortium. Each family's CNV data was analyzed using the seqr platform and candidate CNVs classified using the 2020 ACMG/ClinGen CNV interpretation standards. We developed additional evidence criteria to address situations not covered by the current standards. The addition of CNV calling to exome analysis identified causal CNVs for 173 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb with estimates that 44% would not have been detected by standard chromosomal microarrays. The causal CNVs consisted of 141 deletions, 15 duplications, 4 suspected complex structural variants (SVs), 3 insertions and 10 complex SVs, the latter two groups being identified by orthogonal validation methods. We interpreted 153 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.

8.
Nat Genet ; 55(9): 1589-1597, 2023 09.
Article in English | MEDLINE | ID: mdl-37604963

ABSTRACT

Copy number variants (CNVs) are major contributors to genetic diversity and disease. While standardized methods, such as the genome analysis toolkit (GATK), exist for detecting short variants, technical challenges have confounded uniform large-scale CNV analyses from whole-exome sequencing (WES) data. Given the profound impact of rare and de novo coding CNVs on genome organization and human disease, we developed GATK-gCNV, a flexible algorithm to discover rare CNVs from sequencing read-depth information, complete with open-source distribution via GATK. We benchmarked GATK-gCNV in 7,962 exomes from individuals in quartet families with matched genome sequencing and microarray data, finding up to 95% recall of rare coding CNVs at a resolution of more than two exons. We used GATK-gCNV to generate a reference catalog of rare coding CNVs in WES data from 197,306 individuals in the UK Biobank, and observed strong correlations between per-gene CNV rates and measures of mutational constraint, as well as rare CNV associations with multiple traits. In summary, GATK-gCNV is a tunable approach for sensitive and specific CNV discovery in WES data, with broad applications.


Subject(s)
DNA Copy Number Variations , Exome , Humans , Exome/genetics , Exome Sequencing , DNA Copy Number Variations/genetics , Chromosome Mapping , Exons
9.
Am J Hum Genet ; 110(8): 1229-1248, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37541186

ABSTRACT

Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order, and emerging technologies, such as optical genome mapping and long-read DNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to research consortia focused on elucidating the underlying cause of rare unsolved genetic disorders.


Subject(s)
Exome , Genetic Testing , Humans , Exome/genetics , Sequence Analysis, DNA , Phenotype , Exome Sequencing , Rare Diseases
10.
Am J Hum Genet ; 110(8): 1343-1355, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37541188

ABSTRACT

Despite significant progress in unraveling the genetic causes of neurodevelopmental disorders (NDDs), a substantial proportion of individuals with NDDs remain without a genetic diagnosis after microarray and/or exome sequencing. Here, we aimed to assess the power of short-read genome sequencing (GS), complemented with long-read GS, to identify causal variants in participants with NDD from the National Institute for Health and Care Research (NIHR) BioResource project. Short-read GS was conducted on 692 individuals (489 affected and 203 unaffected relatives) from 465 families. Additionally, long-read GS was performed on five affected individuals who had structural variants (SVs) in technically challenging regions, had complex SVs, or required distal variant phasing. Causal variants were identified in 36% of affected individuals (177/489), and a further 23% (112/489) had a variant of uncertain significance after multiple rounds of re-analysis. Among all reported variants, 88% (333/380) were coding nuclear SNVs or insertions and deletions (indels), and the remainder were SVs, non-coding variants, and mitochondrial variants. Furthermore, long-read GS facilitated the resolution of challenging SVs and invalidated variants of difficult interpretation from short-read GS. This study demonstrates the value of short-read GS, complemented with long-read GS, in investigating the genetic causes of NDDs. GS provides a comprehensive and unbiased method of identifying all types of variants throughout the nuclear and mitochondrial genomes in individuals with NDD.


Subject(s)
Genome, Human , Neurodevelopmental Disorders , Humans , Genome, Human/genetics , Chromosome Mapping , Base Sequence , INDEL Mutation , Neurodevelopmental Disorders/genetics
11.
Neuron ; 111(18): 2800-2810.e5, 2023 09 20.
Article in English | MEDLINE | ID: mdl-37463579

ABSTRACT

Genetic association studies have made significant contributions to our understanding of the etiology of neurodevelopmental disorders (NDDs). However, these studies rarely focused on the African continent. The NeuroDev Project aims to address this diversity gap through detailed phenotypic and genetic characterization of children with NDDs from Kenya and South Africa. We present results from NeuroDev's first year of data collection, including phenotype data from 206 cases and clinical genetic analyses of 99 parent-child trios. Most cases met criteria for global developmental delay/intellectual disability (GDD/ID, 80.3%). Approximately half of the children with GDD/ID also met criteria for autism. Analysis of exome-sequencing data identified a pathogenic or likely pathogenic variant in 13 (17%) of the 75 cases from South Africa and 9 (38%) of the 24 cases from Kenya. Data from the trio pilot are publicly available, and the NeuroDev Project will continue to develop resources for the global genetics community.


Subject(s)
Autistic Disorder , Intellectual Disability , Neurodevelopmental Disorders , Humans , Child , Neurodevelopmental Disorders/genetics , Phenotype , Intellectual Disability/genetics , Autistic Disorder/genetics , Exome , Developmental Disabilities/genetics
12.
Genet Med ; 25(10): 100918, 2023 10.
Article in English | MEDLINE | ID: mdl-37330696

ABSTRACT

PURPOSE: Orofacial clefts (OFCs) are common birth defects including cleft lip, cleft lip and palate, and cleft palate. OFCs have heterogeneous etiologies, complicating clinical diagnostics because it is not always apparent if the cause is Mendelian, environmental, or multifactorial. Sequencing is not currently performed for isolated or sporadic OFCs; therefore, we estimated the diagnostic yield for 418 genes in 841 cases and 294 controls. METHODS: We evaluated 418 genes using genome sequencing and curated variants to assess their pathogenicity using American College of Medical Genetics criteria. RESULTS: 9.04% of cases and 1.02% of controls had "likely pathogenic" variants (P < .0001), which was almost exclusively driven by heterozygous variants in autosomal genes. Cleft palate (17.6%) and cleft lip and palate (9.09%) cases had the highest yield, whereas cleft lip cases had a 2.80% yield. Out of 39 genes with likely pathogenic variants, 9 genes, including CTNND1 and IRF6, accounted for more than half of the yield (4.64% of cases). Most variants (61.8%) were "variants of uncertain significance", occurring more frequently in cases (P = .004), but no individual gene showed a significant excess of variants of uncertain significance. CONCLUSION: These results underscore the etiological heterogeneity of OFCs and suggest sequencing could reduce the diagnostic gap in OFCs.


Subject(s)
Cleft Lip , Cleft Palate , Humans , Cleft Lip/diagnosis , Cleft Lip/genetics , Cleft Palate/diagnosis , Cleft Palate/genetics , Alleles , Chromosome Mapping , Interferon Regulatory Factors/genetics
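
The case-control comparison in the abstract above (ID: mdl-37330696) can be roughly reproduced from the reported percentages. The Python sketch below back-calculates the carrier counts (an assumption: the abstract gives only percentages) and applies a two-sided Fisher's exact test. The abstract does not say which test the authors used, so this is an illustration of one reasonable choice, not their method; the function `fisher_exact_two_sided` is written here for the example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability of x carriers in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(col1, row1)
    p_obs = pmf(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(pmf, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

cases, controls = 841, 294
# Assumed counts, back-calculated from the reported 9.04% and 1.02%.
case_carriers = round(0.0904 * cases)
control_carriers = round(0.0102 * controls)

p_value = fisher_exact_two_sided(
    case_carriers, cases - case_carriers,
    control_carriers, controls - control_carriers,
)
print(case_carriers, control_carriers, p_value < 1e-4)
```

Under these assumed counts the p-value falls below the reported threshold of .0001, which is consistent with the abstract.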
13.
Elife ; 12, 2023 05 16.
Article in English | MEDLINE | ID: mdl-37190854

ABSTRACT

Dietary compounds can affect the development of inflammatory responses at distant sites. However, the mechanisms involved remain incompletely understood. Here, we addressed the influence on allergic responses of dietary agonists of aryl hydrocarbon receptor (AhR). In cutaneous papain-induced allergy, we found that lack of dietary AhR ligands exacerbates allergic responses. This phenomenon was tissue-specific as airway allergy was unaffected by the diet. In addition, lack of dietary AhR ligands worsened asthma-like allergy in a model of 'atopic march.' Mice deprived of dietary AhR ligands displayed impaired Langerhans cell migration, leading to exaggerated T cell responses. Mechanistically, dietary AhR ligands regulated the inflammatory profile of epidermal cells, without affecting barrier function. In particular, we evidenced TGF-β hyperproduction in the skin of mice deprived of dietary AhR ligands, explaining Langerhans cell retention. Our work identifies an essential role for homeostatic activation of AhR by dietary ligands in the dampening of cutaneous allergic responses and uncovers the importance of the gut-skin axis in the development of allergic diseases.


Subject(s)
Dermatitis, Atopic , Diet , Hypersensitivity , Receptors, Aryl Hydrocarbon , Animals , Mice , Langerhans Cells , Ligands , Receptors, Aryl Hydrocarbon/agonists , Skin
14.
ArXiv ; 2023 Jan 18.
Article in English | MEDLINE | ID: mdl-36713248

ABSTRACT

Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order and emerging technologies, such as optical genome mapping and long-read DNA or RNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to a consortium such as GREGoR, which is focused on elucidating the underlying cause of rare unsolved genetic disorders.

15.
Nat Commun ; 14(1): 476, 2023 01 30.
Article in English | MEDLINE | ID: mdl-36717561

ABSTRACT

The adaptive immune response is under circadian control, yet why adaptive immune reactions continue to exhibit circadian changes over long periods of time is unknown. Using a combination of experimental and mathematical modeling approaches, we show here that dendritic cells migrate from the skin to the draining lymph node in a time-of-day-dependent manner, which provides an enhanced likelihood for functional interactions with T cells. Rhythmic expression of TNF in the draining lymph node enhances BMAL1-controlled ICAM-1 expression in high endothelial venules, resulting in lymphocyte infiltration and lymph node expansion. Lymph node cellularity continues to be different for weeks after the initial time-of-day-dependent challenge, which governs the immune response to vaccinations directed against Hepatitis A virus as well as SARS-CoV-2. In this work, we present a mechanistic understanding of the time-of-day-dependent development and maintenance of an adaptive immune response, providing a strategy for using time-of-day to optimize vaccination regimes.


Subject(s)
COVID-19 , Circadian Clocks , Humans , COVID-19/prevention & control , SARS-CoV-2 , Adaptive Immunity , Vaccination , Lymph Nodes
16.
Nat Immunol ; 24(1): 84-95, 2023 01.
Article in English | MEDLINE | ID: mdl-36543959

ABSTRACT

In inflamed tissues, monocytes differentiate into macrophages (mo-Macs) or dendritic cells (mo-DCs). In chronic nonresolving inflammation, mo-DCs are major drivers of pathogenic events. Manipulating monocyte differentiation would therefore be an attractive therapeutic strategy. However, how the balance of mo-DC versus mo-Mac fate commitment is regulated is not clear. In the present study, we show that the transcriptional repressors ETV3 and ETV6 control human monocyte differentiation into mo-DCs. ETV3 and ETV6 inhibit interferon (IFN)-stimulated genes; however, their action on monocyte differentiation is independent of IFN signaling. Instead, we find that ETV3 and ETV6 directly repress mo-Mac development by controlling MAFB expression. Mice deficient for Etv6 in monocytes have spontaneous expression of IFN-stimulated genes, confirming that Etv6 regulates IFN responses in vivo. Furthermore, these mice have impaired mo-DC differentiation during inflammation and reduced pathology in an experimental autoimmune encephalomyelitis model. These findings provide information about the molecular control of monocyte fate decision and identify ETV6 as a therapeutic target to redirect monocyte differentiation in inflammatory disorders.


Subject(s)
Dendritic Cells , Monocytes , Animals , Humans , Mice , Cell Differentiation , Cells, Cultured , Inflammation/metabolism , Macrophages , Proto-Oncogene Proteins c-ets/genetics , Proto-Oncogene Proteins c-ets/metabolism , ETS Translocation Variant 6 Protein
17.
J Mol Biol ; 435(2): 167892, 2023 01 30.
Article in English | MEDLINE | ID: mdl-36410474

ABSTRACT

Constrained Coding Regions (CCRs) in the human genome have been derived from DNA sequencing data of large cohorts of healthy control populations, available in the Genome Aggregation Database (gnomAD) [1]. They identify regions depleted of protein-changing variants and thus identify segments of the genome that have been constrained during human evolution. By mapping these DNA-defined regions from genomic coordinates onto the corresponding protein positions and combining this information with protein annotations, we have explored the distribution of CCRs and compared their co-occurrence with different protein functional features, previously annotated at the amino acid level in public databases. As expected, our results reveal that functional amino acids involved in interactions with DNA/RNA, protein-protein contacts and catalytic sites are the protein features most likely to be highly constrained for variation in the control population. More surprisingly, we also found that linear motifs, linear interacting peptides (LIPs), disorder-order transitions upon binding with other protein partners and liquid-liquid phase separating (LLPS) regions are also strongly associated with high constraint for variability. We also compared intra-species constraints in the human CCRs with inter-species conservation and functional residues to explore how such CCRs may contribute to the analysis of protein variants. As has been previously observed, CCRs are only weakly correlated with conservation, suggesting that intraspecies constraints complement interspecies conservation and can provide more information to interpret variant effects.


Subject(s)
Genome, Human , Open Reading Frames , Proteins , Humans , Base Sequence , Genome, Human/genetics , Genomics , Proteins/genetics , Chromosome Mapping
18.
J Adv Res ; 50: 145-158, 2023 08.
Article in English | MEDLINE | ID: mdl-36323370

ABSTRACT

INTRODUCTION: Whole-genome sequencing using nanopore technologies can uncover structural variants, which are DNA rearrangements larger than 50 base pairs. Nanopore technologies can also characterize their boundaries with single-base accuracy, owing to the kilobase-long reads that encompass either full variants or their junctions. Other methods, such as next-generation short read sequencing or PCR assays, are limited in their capabilities to detect or characterize structural variants. However, the existing software for nanopore sequencing data analysis still reports incomplete variant sets, which also contain erroneous calls, a considerable obstacle for the molecular diagnosis or accurate genotyping of populations. METHODS: We compared multiple factors affecting variant calling, such as reference genome version, aligner (minimap2, NGMLR, and lra) choice, and variant caller combinations (Sniffles, CuteSV, SVIM, and NanoVar), to find the optimal group of tools for calling large (>50 kb) deletions and duplications, using data from seven patients exhibiting gross gene defects on SERPINC1 and from a reference variant set as the control. The goal was to obtain the most complete, yet reasonably specific group of large variants using a single cell of PromethION sequencing, which yielded lower depth coverage than short-read sequencing. We also used a custom method for the statistical analysis of the coverage value to refine the resulting datasets. RESULTS: We found that for large deletions and duplications (>50 kb), the existing software performed worse than for smaller ones, in terms of both sensitivity and specificity, and newer tools had not improved this. Our novel software, disCoverage, could polish variant callers' results, improving specificity by up to 62% and sensitivity by 15%, the latter requiring other data or samples. CONCLUSION: We analyzed the current situation of >50-kb copy number variants with nanopore sequencing, which could be improved. 
The methods presented in this work could help to identify the known deletions and duplications in a set of patients, while also helping to filter out erroneous calls for these variants, which might aid the efforts to characterize a not-yet well-known fraction of genetic variability in the human genome.


Subject(s)
Nanopore Sequencing , Nanopores , Humans , Sequence Analysis, DNA/methods , DNA Copy Number Variations/genetics , Genome, Human
19.
medRxiv ; 2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38234731

ABSTRACT

Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3, as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.

20.
medRxiv ; 2023 Aug 13.
Article in English | MEDLINE | ID: mdl-38328047

ABSTRACT

Background: Causal variants underlying rare disorders may remain elusive even after expansive gene panels or exome sequencing (ES). Clinicians and researchers may then turn to genome sequencing (GS), though the added value of this technique and its optimal use remain poorly defined. We therefore investigated the advantages of GS within a phenotypically diverse cohort. Methods: GS was performed for 744 individuals with rare disease who were genetically undiagnosed. Analysis included review of single nucleotide, indel, structural, and mitochondrial variants. Results: We successfully solved 218/744 (29.3%) cases using GS, with most solves involving established disease genes (157/218, 72.0%). Of all solved cases, 148 (67.9%) had previously had non-diagnostic ES. We systematically evaluated the 218 causal variants for features requiring GS to identify and 61/218 (28.0%) met these criteria, representing 8.2% of the entire cohort. These included small structural variants (13), copy neutral inversions and complex rearrangements (8), tandem repeat expansions (6), deep intronic variants (15), and coding variants that may be more easily found using GS related to uniformity of coverage (19). Conclusion: We describe the diagnostic yield of GS in a large and diverse cohort, illustrating several types of pathogenic variation eluding ES or other techniques. Our results reveal a higher diagnostic yield of GS, supporting the utility of a genome-first approach, with consideration of GS as a secondary or tertiary test when higher-resolution structural variant analysis is needed or there is a strong clinical suspicion for a condition and prior targeted genetic testing has been negative.
