ABSTRACT
In the past decade, human genetics research saw an acceleration of disease gene discovery and further dissection of the genetic architectures of many disorders. Much of this progress was enabled by data aggregation projects, collaborative data sharing among researchers, and the adoption of sophisticated and standardized bioinformatics analysis pipelines. In 2012, we launched the GENESIS platform, formerly known as GEM.app, with two aims: 1) to empower clinical and basic researchers without bioinformatics expertise to analyze and explore genome-level data and 2) to facilitate the detection of novel pathogenic variation and novel disease genes by leveraging data aggregation and genetic matchmaking. The GENESIS database has grown to over 20,000 datasets from rare disease patients, provided by multiple academic research consortia and many individual investigators. Some of the largest global collections of genome-level data are available for Charcot-Marie-Tooth disease, hereditary spastic paraplegia, and cerebellar ataxia. A number of rare disease consortia and networks are archiving their data in this database. Over the past decade, more than 1500 scientists have registered and used this resource and published over 200 papers on gene and variant identifications, which garnered >6000 citations. GENESIS has supported >100 gene discoveries and contributed to approximately half of all gene identifications in the fields of inherited peripheral neuropathies and spastic paraplegia in this time frame. Many diagnostic odysseys of rare disease patients have been resolved. The concept of genomes-to-therapy has been borne out for a number of such discoveries that led to rapid clinical trials and expedited natural history studies. This marks GENESIS as one of the most impactful data aggregation initiatives in rare monogenic diseases.
ABSTRACT
BACKGROUND: Caused by duplications of the gene encoding peripheral myelin protein 22 (PMP22), Charcot-Marie-Tooth disease type 1A (CMT1A) is the most common hereditary neuropathy. Despite this shared genetic origin, there is considerable variability in clinical severity. It is hypothesized that genetic modifiers contribute to this heterogeneity, and their identification may reveal novel therapeutic targets. In this study, we present a comprehensive analysis of clinical examination results from 1564 CMT1A patients sourced from a prospective natural history study conducted by the RDCRN-INC (Inherited Neuropathy Consortium). Our primary objective is to delineate extreme phenotype profiles (mild and severe) within this patient cohort, thereby enhancing our ability to detect genetic modifiers with large effects. METHODS: We conducted large-scale statistical analyses of the RDCRN-INC database to characterize CMT1A severity across multiple metrics. RESULTS: We defined patients below the 10th (mild) and above the 90th (severe) percentiles of age-normalized disease severity based on the CMT Examination Score V2 and foot dorsiflexion strength (MRC scale). Based on these extreme phenotype categories, we defined a statistically justified recruitment strategy, which we propose to use in future modifier studies. INTERPRETATION: Leveraging whole genome sequencing with base pair resolution, a future genetic modifier evaluation will include single nucleotide association tests, gene burden tests, and structural variant analysis. The present work not only provides insight into the severity and course of CMT1A, but also elucidates the statistical foundation and practical considerations for a cost-efficient and straightforward patient enrollment strategy that we intend to apply to additional patients recruited globally.
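As an illustration only, the percentile-based definition of extreme phenotypes can be sketched in a few lines of Python. The abstract does not state how severity was age-normalized, so this sketch assumes residuals from a straight-line least-squares fit of score on age. The function name `classify_extremes`, the variable names, and the data are hypothetical; the data are synthetic and are not taken from the RDCRN-INC cohort.

```python
import random
from statistics import fmean, quantiles


def classify_extremes(ages, scores, lower_pct=10, upper_pct=90):
    """Label each patient 'mild', 'severe', or 'intermediate'.

    ASSUMPTION: age normalization is done by taking residuals from a
    straight-line least-squares fit of score on age. The study's actual
    normalization method is not described in the abstract.
    """
    mean_age, mean_score = fmean(ages), fmean(scores)
    # Ordinary least-squares slope and intercept of score versus age.
    sxy = sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, scores))
    sxx = sum((a - mean_age) ** 2 for a in ages)
    slope = sxy / sxx
    intercept = mean_score - slope * mean_age
    # Residual = observed severity minus severity expected at that age.
    residuals = [s - (slope * a + intercept) for a, s in zip(ages, scores)]
    # 99 percentile cut points; index k-1 holds the k-th percentile.
    cuts = quantiles(residuals, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    labels = []
    for r in residuals:
        if r < low:
            labels.append("mild")      # less severe than expected for age
        elif r > high:
            labels.append("severe")    # more severe than expected for age
        else:
            labels.append("intermediate")
    return labels


# Synthetic example data (hypothetical, for demonstration only).
rng = random.Random(0)
ages = [rng.uniform(10, 70) for _ in range(200)]
scores = [0.2 * a + rng.gauss(0, 3) for a in ages]
labels = classify_extremes(ages, scores)
print(labels.count("mild"), labels.count("severe"))
```

With a different normalization (for example, age-binned z-scores), only the residual computation would change; the percentile thresholds would be applied in the same way.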