Results 1 - 20 of 91
1.
Genome Med ; 16(1): 70, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38769532

ABSTRACT

BACKGROUND: Rare oncogenic driver events, particularly affecting the expression or splicing of driver genes, are suspected to substantially contribute to the large heterogeneity of hematologic malignancies. However, their identification remains challenging. METHODS: To address this issue, we generated the largest dataset to date of matched whole genome sequencing and total RNA sequencing of hematologic malignancies from 3760 patients spanning 24 disease entities. Taking advantage of our dataset size, we focused on discovering rare regulatory aberrations. Therefore, we called expression and splicing outliers using an extension of the workflow DROP (Detection of RNA Outliers Pipeline) and AbSplice, a variant effect predictor that identifies genetic variants causing aberrant splicing. We next trained a machine learning model integrating these results to prioritize new candidate disease-specific driver genes. RESULTS: We found a median of seven expression outlier genes, two splicing outlier genes, and two rare splice-affecting variants per sample. Each category showed significant enrichment for already well-characterized driver genes, with odds ratios exceeding three among genes called in more than five samples. On held-out data, our integrative modeling significantly outperformed modeling based solely on genomic data and revealed promising novel candidate driver genes. Remarkably, we found a truncated form of the low density lipoprotein receptor LRP1B transcript to be aberrantly overexpressed in about half of hairy cell leukemia variant (HCL-V) samples and, to a lesser extent, in closely related B-cell neoplasms. This observation, which was confirmed in an independent cohort, suggests LRP1B as a novel marker for a HCL-V subclass and a yet unreported functional role of LRP1B within these rare entities. 
CONCLUSIONS: Altogether, our census of expression and splicing outliers for 24 hematologic malignancy entities and the companion computational workflow constitute unique resources to deepen our understanding of rare oncogenic events in hematologic cancers.


Subject(s)
Hematologic Neoplasms , Transcriptome , Humans , Hematologic Neoplasms/genetics , RNA Splicing , Gene Expression Regulation, Neoplastic , Oncogenes , Gene Expression Profiling , Receptors, LDL/genetics
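
The enrichment statement in this abstract (odds ratios exceeding three for known driver genes) rests on a 2x2 contingency table. A minimal sketch of that arithmetic follows; the function name and the counts in the usage note are illustrative assumptions and are not taken from the study.

```python
def odds_ratio(driver_outlier, driver_other, nondriver_outlier, nondriver_other):
    """Odds ratio of a 2x2 table: (a * d) / (b * c).

    driver_outlier    (a): known driver genes called as outliers
    driver_other      (b): known driver genes not called
    nondriver_outlier (c): other genes called as outliers
    nondriver_other   (d): other genes not called
    """
    if driver_other == 0 or nondriver_outlier == 0:
        raise ValueError("odds ratio undefined when b or c is zero")
    return (driver_outlier * nondriver_other) / (driver_other * nondriver_outlier)
```

With made-up counts such as `odds_ratio(30, 70, 100, 800)`, the ratio is 24000 / 7000, a little above three.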
2.
medRxiv ; 2024 May 04.
Article in English | MEDLINE | ID: mdl-38746462

ABSTRACT

Solve-RD is a pan-European rare disease (RD) research program that aims to identify disease-causing genetic variants in previously undiagnosed RD families. We utilised 10-fold coverage HiFi long-read sequencing (LRS) for detecting causative structural variants (SVs), single nucleotide variants (SNVs), insertion-deletions (InDels), and short tandem repeat (STR) expansions in extensively studied RD families without clear molecular diagnoses. Our cohort includes 293 individuals from 114 genetically undiagnosed RD families selected by European Rare Disease Network (ERN) experts. Of these, 21 families were affected by so-called 'unsolvable' syndromes for which genetic causes remain unknown, and 93 families with at least one individual affected by a rare neurological, neuromuscular, or epilepsy disorder without genetic diagnosis despite extensive prior testing. Clinical interpretation and orthogonal validation of variants in known disease genes yielded thirteen novel genetic diagnoses due to de novo and rare inherited SNVs, InDels, SVs, and STR expansions. In an additional four families, we identified a candidate disease-causing SV affecting several genes including an MCF2 / FGF13 fusion and PSMA3 deletion. However, no common genetic cause was identified in any of the 'unsolvable' syndromes. Taken together, we found (likely) disease-causing genetic variants in 13.0% of previously unsolved families and additional candidate disease-causing SVs in another 4.3% of these families. In conclusion, our results demonstrate the added value of HiFi long-read genome sequencing in undiagnosed rare diseases.

3.
Genome Biol ; 25(1): 83, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566111

ABSTRACT

BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.


Subject(s)
DNA , Regulatory Sequences, Nucleic Acid , Binding Sites , Sequence Alignment , Algorithms , Conserved Sequence/genetics , Evolution, Molecular
4.
Mol Syst Biol ; 20(5): 506-520, 2024 May.
Article in English | MEDLINE | ID: mdl-38491213

ABSTRACT

Codon optimality is a major determinant of mRNA translation and degradation rates. However, whether and through which mechanisms its effects are regulated remains poorly understood. Here we show that codon optimality associates with up to 2-fold change in mRNA stability variations between human tissues, and that its effect is attenuated in tissues with high energy metabolism and amplifies with age. Mathematical modeling and perturbation data through oxygen deprivation and ATP synthesis inhibition reveal that cellular energy variations non-uniformly alter the effect of codon usage. This new mode of codon effect regulation, independent of tRNA regulation, provides a fundamental mechanistic link between cellular energy metabolism and eukaryotic gene expression.


Subject(s)
Codon , Energy Metabolism , RNA Stability , RNA, Messenger , Humans , Energy Metabolism/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Codon/genetics , Codon Usage , Protein Biosynthesis , RNA, Transfer/genetics , RNA, Transfer/metabolism , Adenosine Triphosphate/metabolism , Gene Expression Regulation
5.
Sci Rep ; 14(1): 5768, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459123

ABSTRACT

The SARS-CoV-2 pandemic has highlighted the need to better define in-hospital transmissions, a need that extends to all other common infectious diseases encountered in clinical settings. To evaluate how whole viral genome sequencing can contribute to deciphering nosocomial SARS-CoV-2 transmission, 926 SARS-CoV-2 viral genomes from 622 staff members and patients were collected between February 2020 and January 2021 at a university hospital in Munich, Germany, and analysed along with the place of work, duration of hospital stay, and ward transfers. Bioinformatically defined transmission clusters inferred from viral genome sequencing were compared to those inferred from interview-based contact tracing. An additional dataset collected at the same time at another university hospital in the same city was used to account for multiple independent introductions. Clustering analysis of 619 viral genomes generated 19 clusters ranging from 3 to 31 individuals. Sequencing-based transmission clusters showed little overlap with those based on contact tracing data. The viral genomes were significantly more closely related to each other than comparable genomes collected simultaneously at other hospitals in the same city (n = 829), suggesting nosocomial transmission. Longitudinal sampling from individual patients suggested possible cross-infection events during the hospital stay in 19.2% of individuals (14 of 73 individuals). Clustering analysis of SARS-CoV-2 whole genome sequences can reveal cryptic transmission events missed by classical, interview-based contact tracing, helping to decipher in-hospital transmissions. These results, in line with other studies, advocate for viral genome sequencing as a pathogen transmission surveillance tool in hospitals.


Subject(s)
COVID-19 , Cross Infection , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/genetics , Genome, Viral/genetics , Cross Infection/epidemiology , Cross Infection/genetics , Hospitals, University
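
The abstract does not say how the transmission clusters were defined bioinformatically. As a rough illustration of the general idea only, the sketch below groups aligned genomes by single-linkage clustering on pairwise mismatch counts. The distance threshold, the handling of `N` as missing data, and all function names are assumptions made for this example.

```python
def hamming(a, b):
    """Count mismatching positions between two aligned sequences, ignoring 'N'."""
    return sum(1 for x, y in zip(a, b) if x != y and x != "N" and y != "N")


def cluster_genomes(genomes, max_dist=1):
    """Single-linkage clustering: link two samples if their distance <= max_dist.

    genomes: dict mapping sample id -> aligned sequence (equal lengths assumed).
    Returns a sorted list of sorted sample-id lists.
    """
    ids = list(genomes)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if hamming(genomes[a], genomes[b]) <= max_dist:
                parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())
```

Single linkage chains samples together: two genomes that differ by more than the threshold still share a cluster when an intermediate genome connects them, which fits stepwise transmission but can also merge unrelated introductions.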
6.
bioRxiv ; 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38260253

ABSTRACT

Aging and neurodegeneration entail diverse cellular and molecular hallmarks. Here, we studied the effects of aging on the transcriptome, translatome, and multiple layers of the proteome in the brain of a short-lived killifish. We reveal that aging causes widespread reduction of proteins enriched in basic amino acids that is independent of mRNA regulation, and it is not due to impaired proteasome activity. Instead, we identify a cascade of events where aberrant translation pausing leads to reduced ribosome availability resulting in proteome remodeling independently of transcriptional regulation. Our research uncovers a vulnerable point in the aging brain's biology - the biogenesis of basic DNA/RNA binding proteins. This vulnerability may represent a unifying principle that connects various aging hallmarks, encompassing genome integrity and the biosynthesis of macromolecules.

7.
Nat Commun ; 15(1): 151, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38167372

ABSTRACT

Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, hindering the utility of proteomics in applications where the sequences are unknown including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a de novo peptide sequencing method for tandem mass spectrometry. Spectralis leverages several innovations including a convolutional neural network layer connecting peaks in spectra spaced by amino acid masses, proposing fragment ion series classification as a pivotal task for de novo peptide sequencing, and a peptide-spectrum confidence score. On spectra for which database search provided a ground truth, Spectralis surpassed 40% sensitivity at 90% precision, nearly doubling state-of-the-art sensitivity. Application to unidentified spectra confirmed its superiority and showcased its applicability to variant calling. Altogether, these algorithmic innovations and the substantial sensitivity increase in the high-precision range constitute an important step toward broadly applicable peptide sequencing.


Subject(s)
Deep Learning , Algorithms , Sequence Analysis, Protein/methods , Peptides/chemistry , Amino Acid Sequence
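
The abstract mentions connecting peaks in spectra that are spaced by amino acid masses. The sketch below shows only that spacing idea, not the Spectralis network itself: it lists peak pairs whose m/z gap matches a residue mass. The residue table is a small subset of standard monoisotopic masses; the tolerance, the singly charged default, and the function name are assumptions.

```python
# Standard monoisotopic residue masses in Da (subset, for illustration).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
}


def peak_pairs(mz_values, tol=0.01, charge=1):
    """Return (i, j, residue) for peak pairs whose mass gap matches a residue.

    Indices refer to the peaks after sorting by m/z. The m/z gap is multiplied
    by the assumed fragment charge to obtain a mass difference.
    """
    peaks = sorted(mz_values)
    pairs = []
    for i, low in enumerate(peaks):
        for j in range(i + 1, len(peaks)):
            gap = (peaks[j] - low) * charge
            for residue, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    pairs.append((i, j, residue))
    return pairs
```

Chains of such pairs correspond to candidate fragment ion series, which is why connecting mass-spaced peaks is a natural inductive bias for de novo sequencing.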
8.
Nat Methods ; 21(1): 28-31, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38049697

ABSTRACT

Single-cell ATAC sequencing coverage in regulatory regions is typically binarized as an indicator of open chromatin. Here we show that binarization is an unnecessary step that neither improves goodness of fit, clustering, cell type identification nor batch integration. Fragment counts, but not read counts, should instead be modeled, which preserves quantitative regulatory information. These results have immediate implications for single-cell ATAC sequencing analysis.


Subject(s)
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Chromatin/genetics , Single-Cell Analysis
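
To make the contrast in this abstract concrete, the sketch below compares binarized coverage with fragment counts for one cell. The read-to-fragment conversion shown (halving, rounding up) rests on the assumption that each paired-end fragment contributes up to two reads to a region; the abstract does not state a conversion, so treat it as illustrative.

```python
def binarize(counts):
    """Collapse counts to a 0/1 open-chromatin indicator."""
    return [1 if c > 0 else 0 for c in counts]


def reads_to_fragments(read_counts):
    """Approximate fragment counts from read counts.

    Assumption: a fragment yields two reads, so an odd read count is rounded
    up to the next even number before halving, i.e. ceil(c / 2).
    """
    return [(c + 1) // 2 for c in read_counts]
```

For read counts `[0, 1, 2, 3, 4]`, binarization gives `[0, 1, 1, 1, 1]` while the fragment conversion gives `[0, 1, 1, 2, 2]`, keeping the quantitative signal that binarization discards.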
9.
NAR Genom Bioinform ; 5(4): lqad095, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37942285

ABSTRACT

Functional gene embeddings, numerical vectors capturing gene function, provide a promising way to integrate functional gene information into machine learning models. These embeddings are learnt by applying self-supervised machine-learning algorithms on various data types including quantitative omics measurements, protein-protein interaction networks and literature. However, downstream evaluations comparing alternative data modalities used to construct functional gene embeddings have been lacking. Here we benchmarked functional gene embeddings obtained from various data modalities for predicting disease-gene lists, cancer drivers, phenotype-gene associations and scores from genome-wide association studies. Off-the-shelf predictors trained on precomputed embeddings matched or outperformed dedicated state-of-the-art predictors, demonstrating their high utility. Embeddings based on literature and protein-protein interactions inferred from low-throughput experiments outperformed embeddings derived from genome-wide experimental data (transcriptomics, deletion screens and protein sequence) when predicting curated gene lists. In contrast, they did not perform better when predicting genome-wide association signals and were biased towards highly-studied genes. These results indicate that embeddings derived from literature and low-throughput experiments appear favourable in many existing benchmarks because they are biased towards well-studied genes and should therefore be considered with caution. Altogether, our study and precomputed embeddings will facilitate the development of machine-learning models in genetics and related fields.

10.
Am J Hum Genet ; 110(12): 2056-2067, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38006880

ABSTRACT

Detection of aberrantly spliced genes is an important step in RNA-seq-based rare-disease diagnostics. We recently developed FRASER, a denoising autoencoder-based method that outperformed alternative methods of detecting aberrant splicing. However, because FRASER's three splice metrics are partially redundant and tend to be sensitive to sequencing depth, we introduce here a more robust intron-excision metric, the intron Jaccard index, that combines the alternative donor, alternative acceptor, and intron-retention signal into a single value. Moreover, we optimized model parameters and filter cutoffs by using candidate rare-splice-disrupting variants as independent evidence. On 16,213 GTEx samples, our improved algorithm, FRASER 2.0, called typically 10 times fewer splicing outliers while increasing the proportion of candidate rare-splice-disrupting variants by 10-fold and substantially decreasing the effect of sequencing depth on the number of reported outliers. To lower the multiple-testing correction burden, we introduce an option to select the genes to be tested for each sample instead of a transcriptome-wide approach. This option can be particularly useful when prior information, such as candidate variants or genes, is available. Application on 303 rare-disease samples confirmed the relative reduction in the number of outlier calls for a slight loss of sensitivity; FRASER 2.0 recovered 22 out of 26 previously identified pathogenic splicing cases with default cutoffs and 24 when multiple-testing correction was limited to OMIM genes containing rare variants. Altogether, these methodological improvements contribute to more effective RNA-seq-based rare diagnostics by drastically reducing the amount of splicing outlier calls per sample at minimal loss of sensitivity.


Subject(s)
Alternative Splicing , RNA Splicing , Humans , Alternative Splicing/genetics , Introns/genetics , RNA Splicing/genetics , RNA-Seq , Algorithms
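
One way to read the intron Jaccard index described in this abstract is as the share of reads supporting a given intron among all reads that touch either of its splice sites. The sketch below follows that reading; the exact definition in FRASER 2.0 may differ, and the argument names are assumptions made for illustration.

```python
def intron_jaccard(intron_split, donor_split_total, acceptor_split_total,
                   donor_nonsplit, acceptor_nonsplit):
    """Intron Jaccard index under the reading described above.

    intron_split:          split reads spanning exactly this donor-acceptor pair
    donor_split_total:     all split reads using the donor (includes intron_split)
    acceptor_split_total:  all split reads using the acceptor (includes intron_split)
    donor_nonsplit:        unsplit reads covering the donor site (retention signal)
    acceptor_nonsplit:     unsplit reads covering the acceptor site
    """
    # Union of reads at either site; the intron's own reads were counted twice.
    union = (donor_split_total + acceptor_split_total - intron_split
             + donor_nonsplit + acceptor_nonsplit)
    if union == 0:
        return float("nan")
    return intron_split / union
```

Because alternative donor usage, alternative acceptor usage, and intron retention all enlarge the denominator, any of the three lowers the index, which is how a single value can replace three partially redundant metrics.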
11.
Genome Biol ; 24(1): 180, 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37542318

ABSTRACT

We present RBPNet, a novel deep learning method, which predicts CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs and enables variant-impact scoring via in silico mutagenesis. Together, RBPNet improves imputation of protein-RNA interactions, as well as mechanistic interpretation of predictions.


Subject(s)
Base Sequence , Computer Simulation , Deep Learning , RNA-Binding Proteins , RNA , Humans , Alleles , Bias , Binding Sites , Consensus Sequence , Datasets as Topic , Internet , Mutation , Nucleotide Motifs , Nucleotides/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism , RNA Splice Sites , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Viral/chemistry , RNA, Viral/genetics , RNA, Viral/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism
12.
Nat Genet ; 55(5): 861-870, 2023 May.
Article in English | MEDLINE | ID: mdl-37142848

ABSTRACT

Aberrant splicing is a major cause of genetic disorders but its direct detection in transcriptomes is limited to clinically accessible tissues such as skin or body fluids. While DNA-based machine learning models can prioritize rare variants for affecting splicing, their performance in predicting tissue-specific aberrant splicing remains unassessed. Here we generated an aberrant splicing benchmark dataset, spanning over 8.8 million rare variants in 49 human tissues from the Genotype-Tissue Expression (GTEx) dataset. At 20% recall, state-of-the-art DNA-based models achieve maximum 12% precision. By mapping and quantifying tissue-specific splice site usage transcriptome-wide and modeling isoform competition, we increased precision by threefold at the same recall. Integrating RNA-sequencing data of clinically accessible tissues into our model, AbSplice, brought precision to 60%. These results, replicated in two independent cohorts, substantially contribute to noncoding loss-of-function variant identification and to genetic diagnostics design and analytics.


Subject(s)
Alternative Splicing , RNA Splicing , Humans , RNA Splicing/genetics , Alternative Splicing/genetics , Sequence Analysis, RNA/methods , Transcriptome , Protein Isoforms
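
The benchmark figures in this abstract are precisions at a fixed recall (for example, 12% precision at 20% recall). A minimal sketch of how such a number can be computed from ranked predictions follows; the function name is an assumption, and ties between scores are broken arbitrarily.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the first rank where recall reaches target_recall.

    scores: predicted scores, higher means more likely aberrant
    labels: 1 for true positives in the benchmark, 0 otherwise
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive labels")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_pos = 0
    for rank, i in enumerate(order, start=1):
        true_pos += labels[i]
        if true_pos / total_pos >= target_recall:
            return true_pos / rank
    return true_pos / len(order)
```

Reporting precision at a low, fixed recall reflects the diagnostic setting, where only the top-ranked variants per patient can realistically be followed up.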
13.
medRxiv ; 2023 May 11.
Article in English | MEDLINE | ID: mdl-37214898

ABSTRACT

Genome-wide association studies have unearthed a wealth of genetic associations across many complex diseases. However, translating these associations into biological mechanisms contributing to disease etiology and heterogeneity has been challenging. Here, we hypothesize that the effects of disease-associated genetic variants converge onto distinct cell type specific molecular pathways within distinct subgroups of patients. In order to test this hypothesis, we develop the CASTom-iGEx pipeline to operationalize individual level genotype data to interpret personal polygenic risk and identify the genetic basis of clinical heterogeneity. The paradigmatic application of this approach to coronary artery disease and schizophrenia reveals a convergence of disease associated variant effects onto known and novel genes, pathways, and biological processes. The biological process specific genetic liabilities are not equally distributed across patients. Instead, they defined genetically distinct groups of patients, characterized by different profiles across pathways, endophenotypes, and disease severity. These results provide further evidence for a genetic contribution to clinical heterogeneity and point to the existence of partially distinct pathomechanisms across patient subgroups. Thus, the universally applicable approach presented here has the potential to constitute an important component of future personalized medicine concepts.

14.
medRxiv ; 2023 Apr 03.
Article in English | MEDLINE | ID: mdl-37066374

ABSTRACT

Detection of aberrantly spliced genes is an important step in RNA-seq-based rare disease diagnostics. We recently developed FRASER, a denoising autoencoder-based method for aberrant splicing detection that outperformed alternative approaches. However, as FRASER's three splice metrics are partially redundant and tend to be sensitive to sequencing depth, we introduce here a more robust intron excision metric, the Intron Jaccard Index, that combines alternative donor, alternative acceptor, and intron retention signal into a single value. Moreover, we optimized model parameters and filter cutoffs using candidate rare splice-disrupting variants as independent evidence. On 16,213 GTEx samples, our improved algorithm called typically 10 times fewer splicing outliers while increasing the proportion of candidate rare splice-disrupting variants by 10 fold and substantially decreasing the effect of sequencing depth on the number of reported outliers. Application on 303 rare disease samples confirmed the reduction fold-change of the number of outlier calls for a slight loss of sensitivity (only 2 out of 22 previously identified pathogenic splicing cases not recovered). Altogether, these methodological improvements contribute to more effective RNA-seq-based rare diagnostics by a drastic reduction of the amount of splicing outlier calls per sample at minimal loss of sensitivity.

15.
Nat Biotechnol ; 41(12): 1787-1800, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37012447

ABSTRACT

The epicardium, the mesothelial envelope of the vertebrate heart, is the source of multiple cardiac cell lineages during embryonic development and provides signals that are essential to myocardial growth and repair. Here we generate self-organizing human pluripotent stem cell-derived epicardioids that display retinoic acid-dependent morphological, molecular and functional patterning of the epicardium and myocardium typical of the left ventricular wall. By combining lineage tracing, single-cell transcriptomics and chromatin accessibility profiling, we describe the specification and differentiation process of different cell lineages in epicardioids and draw comparisons to human fetal development at the transcriptional and morphological levels. We then use epicardioids to investigate the functional cross-talk between cardiac cell types, gaining new insights into the role of IGF2/IGF1R and NRP2 signaling in human cardiogenesis. Finally, we show that epicardioids mimic the multicellular pathogenesis of congenital or stress-induced hypertrophy and fibrotic remodeling. As such, epicardioids offer a unique testing ground of epicardial activity in heart development, disease and regeneration.


Subject(s)
Heart , Pericardium , Humans , Pericardium/metabolism , Myocardium , Cell Differentiation/genetics , Cell Lineage/genetics , Biology
16.
Genome Biol ; 24(1): 56, 2023 Mar 27.
Article in English | MEDLINE | ID: mdl-36973806

ABSTRACT

BACKGROUND: The largest sequence-based models of transcription control to date are obtained by predicting genome-wide gene regulatory assays across the human genome. This setting is fundamentally correlative, as those models are exposed during training solely to the sequence variation between human genes that arose through evolution, questioning the extent to which those models capture genuine causal signals. RESULTS: Here we confront predictions of state-of-the-art models of transcription regulation against data from two large-scale observational studies and five deep perturbation assays. The most advanced of these sequence-based models, Enformer, by and large, captures causal determinants of human promoters. However, models fail to capture the causal effects of enhancers on expression, notably in medium to long distances and particularly for highly expressed promoters. More generally, the predicted impact of distal elements on gene expression predictions is small and the ability to correctly integrate long-range information is significantly more limited than the receptive fields of the models suggest. This is likely caused by the escalating class imbalance between actual and candidate regulatory elements as distance increases. CONCLUSIONS: Our results suggest that sequence-based models have advanced to the point that in silico study of promoter regions and promoter variants can provide meaningful insights and we provide practical guidance on how to use them. Moreover, we foresee that it will require significantly more and particularly new kinds of data to train models accurately accounting for distal elements.


Subject(s)
Enhancer Elements, Genetic , Genomics , Humans , Genomics/methods , Promoter Regions, Genetic , Gene Expression Regulation , Gene Expression
17.
Nucleic Acids Res ; 51(4): e21, 2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36617985

ABSTRACT

Transposon screens are powerful in vivo assays used to identify loci driving carcinogenesis. These loci are identified as Common Insertion Sites (CISs), i.e. regions with more transposon insertions than expected by chance. However, the identification of CISs is affected by biases in the insertion behaviour of transposon systems. Here, we introduce Transmicron, a novel method that differs from previous methods by (i) modelling neutral insertion rates based on chromatin accessibility, transcriptional activity and sequence context and (ii) estimating oncogenic selection for each genomic region using Poisson regression to model insertion counts while controlling for neutral insertion rates. To assess the benefits of our approach, we generated a dataset applying two different transposon systems under comparable conditions. Benchmarking for enrichment of known cancer genes showed improved performance of Transmicron against state-of-the-art methods. Modelling neutral insertion rates allowed for better control of false positives and stronger agreement of the results between transposon systems. Moreover, using Poisson regression to consider intra-sample and inter-sample information proved beneficial in small and moderately-sized datasets. Transmicron is open-source and freely available. Overall, this study contributes to the understanding of transposon biology and introduces a novel approach to use this knowledge for discovering cancer driver genes.


Subject(s)
DNA Transposable Elements , Neoplasms , Software , Humans , Base Sequence , Carcinogenesis , Mutagenesis, Insertional , Oncogenes , Neoplasms/genetics
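
Transmicron models insertion counts with Poisson regression while controlling for neutral insertion rates. The sketch below is a much simpler stand-in, not the published method: it treats the neutral expectation for one region as a fixed Poisson mean and asks how surprising the observed count is. Function names and the single-region setup are assumptions.

```python
import math


def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed from the cumulative sum."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def selection_ratio(observed, expected_neutral):
    """Observed insertions relative to the neutral expectation for a region."""
    if expected_neutral <= 0:
        raise ValueError("expected_neutral must be positive")
    return observed / expected_neutral
```

A region with 12 insertions against a neutral expectation of 3 has a ratio of 4, and `poisson_sf(12, 3.0)` gives the tail probability of seeing at least that many insertions by chance under this simplified model.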
18.
Bioinformatics ; 39(2), 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36708003

ABSTRACT

MOTIVATION: Identifying regulatory regions in the genome is of great interest for understanding the epigenomic landscape in cells. One fundamental challenge in this context is to find the target genes whose expression is affected by the regulatory regions. A recent successful method is the Activity-By-Contact (ABC) model which scores enhancer-gene interactions based on enhancer activity and the contact frequency of an enhancer to its target gene. However, it describes regulatory interactions entirely from a gene's perspective, and does not account for all the candidate target genes of an enhancer. In addition, the ABC model requires two types of assays to measure enhancer activity, which limits the applicability. Moreover, there is neither implementation available that could allow for an integration with transcription factor (TF) binding information nor an efficient analysis of single-cell data. RESULTS: We demonstrate that the ABC score can yield a higher accuracy by adapting the enhancer activity according to the number of contacts the enhancer has to its candidate target genes and also by considering all annotated transcription start sites of a gene. Further, we show that the model is comparably accurate with only one assay to measure enhancer activity. We combined our generalized ABC model with TF binding information and illustrated an analysis of a single-cell ATAC-seq dataset of the human heart, where we were able to characterize cell type-specific regulatory interactions and predict gene expression based on TF affinities. All executed processing steps are incorporated into our new computational pipeline STARE. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/schulzlab/STARE. CONTACT: marcel.schulz@em.uni-frankfurt.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Regulation , Transcription Factors , Humans , Transcription Factors/metabolism , Regulatory Sequences, Nucleic Acid , Software , Protein Binding
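
The Activity-By-Contact score discussed in this abstract weights each enhancer's activity by its contact with a gene and normalizes over all candidate enhancers of that gene. The sketch below shows that classical form plus one possible reading of the "adapted activity" idea, in which an enhancer's activity is split across its candidate genes in proportion to contact. The second function is an assumption based on the abstract's wording, not the STARE implementation.

```python
def abc_scores(activity, contact):
    """Classical ABC score: A_e * C_eg / sum over e' of A_e' * C_e'g.

    activity: dict enhancer -> activity
    contact:  dict (enhancer, gene) -> contact frequency
    """
    denom = {}
    for (enh, gene), c in contact.items():
        denom[gene] = denom.get(gene, 0.0) + activity[enh] * c
    return {
        (enh, gene): activity[enh] * c / denom[gene]
        for (enh, gene), c in contact.items()
        if denom[gene] > 0
    }


def adapted_activity(activity, contact):
    """Hypothetical adaptation: share an enhancer's activity among its genes.

    Each (enhancer, gene) pair receives A_e * C_eg / sum over g' of C_eg'.
    """
    total = {}
    for (enh, _gene), c in contact.items():
        total[enh] = total.get(enh, 0.0) + c
    return {
        (enh, gene): activity[enh] * c / total[enh]
        for (enh, gene), c in contact.items()
        if total[enh] > 0
    }
```

The classical score looks only at the enhancers competing for one gene; the adapted activity also accounts for the genes competing for one enhancer, which is the gap the abstract points out.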
19.
Brain Pathol ; 33(3): e13134, 2023 May.
Article in English | MEDLINE | ID: mdl-36450274

ABSTRACT

Mitochondrial translation defects are a continuously growing group of disorders showing a large variety of clinical symptoms including a wide range of neurological abnormalities. To date, mutations in PTCD3, encoding a component of the mitochondrial ribosome, have only been reported in a single individual with clinical evidence of Leigh syndrome. Here, we describe three additional PTCD3 individuals from two unrelated families, broadening the genetic and phenotypic spectrum of this disorder, and provide definitive evidence that PTCD3 deficiency is associated with Leigh syndrome. The patients presented in the first months of life with psychomotor delay, respiratory insufficiency and feeding difficulties. The neurologic phenotype included dystonia, optic atrophy, nystagmus and tonic-clonic seizures. Brain MRI showed optic nerve atrophy and thalamic changes, consistent with Leigh syndrome. WES and RNA-seq identified compound heterozygous variants in PTCD3 in both families: c.[1453-1G>C];[1918C>G] and c.[710del];[902C>T]. The functional consequences of the identified variants were determined by a comprehensive characterization of the mitochondrial function. PTCD3 protein levels were significantly reduced in patient fibroblasts and, consistent with a mitochondrial translation defect, a severe reduction in the steady state levels of complexes I and IV subunits was detected. Accordingly, the activity of these complexes was also low, and high-resolution respirometry showed a significant decrease in the mitochondrial respiratory capacity. Functional complementation studies demonstrated the pathogenic effect of the identified variants since the expression of wild-type PTCD3 in immortalized fibroblasts restored the steady-state levels of complexes I and IV subunits as well as the mitochondrial respiratory capacity. Additionally, minigene assays demonstrated that three of the identified variants were pathogenic by altering PTCD3 mRNA processing. The fourth variant was a frameshift leading to a truncated protein. In summary, we provide evidence of PTCD3 involvement in human disease confirming that PTCD3 deficiency is definitively associated with Leigh syndrome.


Subject(s)
Arabidopsis Proteins , Leigh Disease , Humans , Leigh Disease/genetics , Leigh Disease/pathology , Mitochondria/pathology , Proteins/genetics , Mutation/genetics , Phenotype , RNA-Binding Proteins , Arabidopsis Proteins/genetics
20.
NPJ Genom Med ; 7(1): 74, 2022 Dec 28.
Article in English | MEDLINE | ID: mdl-36577754

ABSTRACT

RNA sequencing (RNA-seq) is emerging in genetic diagnoses as it provides functional support for the interpretation of variants of uncertain significance. However, the use of amniotic fluid (AF) cells for RNA-seq has not yet been explored. Here, we examined the expression of clinically relevant genes in AF cells (n = 48) compared with whole blood and fibroblasts. The number of well-expressed genes in AF cells was comparable to that in fibroblasts and much higher than that in blood across different disease categories. We found AF cells RNA-seq feasible and beneficial in prenatal diagnosis (n = 4) as transcriptomic data elucidated the molecular consequence leading to the pathogenicity upgrade of variants in CHD7 and COL1A2 and revising the in silico prediction of a variant in MYRF. AF cells RNA-seq could become a reasonable choice for postnatal patients with advantages over fibroblasts and blood as it prevents invasive procedures.
