Search | VHL Regional Portal

1.

SimReadUntil for benchmarking selective sequencing algorithms on ONT devices.

Mordig, Maximilian; Rätsch, Gunnar; Kahles, André.

Bioinformatics ; 40(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38603597

ABSTRACT

MOTIVATION: The Oxford Nanopore Technologies (ONT) ReadUntil API enables selective sequencing, which aims to selectively favor interesting over uninteresting reads, e.g. to deplete or enrich certain genomic regions. The performance gain depends on the selective sequencing decision-making algorithm (SSDA) which decides whether to reject a read, stop receiving a read, or wait for more data. Since real runs are time-consuming and costly, simulating the ONT sequencer with support for the ReadUntil API is highly beneficial for comparing and optimizing new SSDAs. Existing software like MinKNOW and UNCALLED only return raw signal data, are memory-intensive, require huge and often unavailable multi-fast5 files (≥100GB) and are not clearly documented. RESULTS: We present the ONT device simulator SimReadUntil that takes a set of full reads as input, distributes them to channels and plays them back in real time including mux scans, channel gaps and blockages, and allows rejecting reads as well as stopping the reception of data from them. Our modified ReadUntil API provides the basecalled reads rather than the raw signal, reducing computational load and focusing on the SSDA rather than on basecalling. Tuning the parameters of tools like ReadFish and ReadBouncer becomes easier because a GPU for basecalling is no longer required. We offer various methods to extract simulation parameters from a sequencing summary file and adapt ReadFish to replicate one of their enrichment experiments. SimReadUntil's gRPC interface allows standardized interaction with a wide range of programming languages. AVAILABILITY AND IMPLEMENTATION: Code and fully worked examples are available on GitHub (https://github.com/ratschlab/sim_read_until).
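
The three-way decision that the abstract attributes to an SSDA (reject, stop receiving, or wait for more data) can be illustrated with a toy rule. This is a minimal sketch and not SimReadUntil's or ReadFish's API: the names `Decision` and `decide`, the k-mer matching rule, and all thresholds are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Decision(Enum):
    REJECT = "reject"          # eject the read from the pore
    STOP_RECEIVING = "stop"    # keep sequencing, but stop streaming its data
    WAIT = "wait"              # too little data so far, ask for more


def decide(basecalled_prefix, target_kmers, k=5, min_len=20, min_hits=2):
    """Toy SSDA (hypothetical): count prefix k-mers that occur in the target set."""
    if len(basecalled_prefix) < min_len:
        return Decision.WAIT
    hits = sum(
        basecalled_prefix[i:i + k] in target_kmers
        for i in range(len(basecalled_prefix) - k + 1)
    )
    return Decision.STOP_RECEIVING if hits >= min_hits else Decision.REJECT


# Hypothetical target region and its k-mer set (k = 5).
target = "ACGTACGGTCAGTTTACGGA"
target_kmers = {target[i:i + 5] for i in range(len(target) - 4)}
```

Working on basecalled prefixes rather than raw signal mirrors the abstract's point that the simulator lets one tune the decision rule without a basecalling step.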

Subject(s)

Algorithms , Benchmarking , Software , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Nanopore Sequencing/methods

2.

Modeling multiple sclerosis using mobile and wearable sensor data.

Gashi, Shkurta; Oldrati, Pietro; Moebus, Max; Hilty, Marc; Barrios, Liliana; Ozdemir, Firat; Kana, Veronika; Lutterotti, Andreas; Rätsch, Gunnar; Holz, Christian.

NPJ Digit Med ; 7(1): 64, 2024 Mar 11.

Article in English | MEDLINE | ID: mdl-38467710

ABSTRACT

Multiple sclerosis (MS) is a neurological disease of the central nervous system that is the leading cause of non-traumatic disability in young adults. Clinical laboratory tests and neuroimaging studies are the standard methods to diagnose and monitor MS. However, due to infrequent clinic visits, it is fundamental to identify remote and frequent approaches for monitoring MS, which enable timely diagnosis, early access to treatment, and slowing down disease progression. In this work, we investigate the most reliable, clinically useful, and available features derived from mobile and wearable devices as well as their ability to distinguish people with MS (PwMS) from healthy controls, recognize MS disability and fatigue levels. To this end, we formalize clinical knowledge and derive behavioral markers to characterize MS. We evaluate our approach on a dataset we collected from 55 PwMS and 24 healthy controls for a total of 489 days conducted in free-living conditions. The dataset contains wearable sensor data - e.g., heart rate - collected using an arm-worn device, smartphone data - e.g., phone locks - collected through a mobile application, patient health records - e.g., MS type - obtained from the hospital, and self-reports - e.g., fatigue level - collected using validated questionnaires administered via the mobile application. Our results demonstrate the feasibility of using features derived from mobile and wearable sensors to monitor MS. Our findings open up opportunities for continuous monitoring of MS in free-living conditions and can be used to evaluate and guide the effectiveness of treatments, manage the disease, and identify participants for clinical trials.

3.

Author Correction: Learning single-cell perturbation responses using neural optimal transport.

Bunne, Charlotte; Stark, Stefan G; Gut, Gabriele; Del Castillo, Jacobo Sarabia; Levesque, Mitch; Lehmann, Kjong-Van; Pelkmans, Lucas; Krause, Andreas; Rätsch, Gunnar.

Nat Methods ; 20(11): 1830, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37903912

4.

Mutant SF3B1 promotes malignancy in PDAC.

Simmler, Patrik; Ioannidi, Eleonora I; Mengis, Tamara; Marquart, Kim Fabiano; Asawa, Simran; Van-Lehmann, Kjong; Kahles, Andre; Thomas, Tinu; Schwerdel, Cornelia; Aceto, Nicola; Rätsch, Gunnar; Stoffel, Markus; Schwank, Gerald.

Elife ; 12, 2023 10 12.

Article in English | MEDLINE | ID: mdl-37823551

ABSTRACT

The splicing factor SF3B1 is recurrently mutated in various tumors, including pancreatic ductal adenocarcinoma (PDAC). The impact of the hotspot mutation SF3B1K700E on the PDAC pathogenesis, however, remains elusive. Here, we demonstrate that Sf3b1K700E alone is insufficient to induce malignant transformation of the murine pancreas, but that it increases aggressiveness of PDAC if it co-occurs with mutated KRAS and p53. We further show that Sf3b1K700E already plays a role during early stages of pancreatic tumor progression and reduces the expression of TGF-β1-responsive epithelial-mesenchymal transition (EMT) genes. Moreover, we found that SF3B1K700E confers resistance to TGF-β1-induced cell death in pancreatic organoids and cell lines, partly mediated through aberrant splicing of Map3k7. Overall, our findings demonstrate that SF3B1K700E acts as an oncogenic driver in PDAC, and suggest that it promotes the progression of early stage tumors by impeding the cellular response to tumor suppressive effects of TGF-β.

Subject(s)

Carcinoma, Pancreatic Ductal , Pancreatic Neoplasms , Animals , Humans , Mice , Carcinoma, Pancreatic Ductal/pathology , Cell Line, Tumor , Mutation , Pancreatic Ducts/metabolism , Pancreatic Neoplasms/pathology , Phosphoproteins/metabolism , RNA Splicing Factors/metabolism , Transcription Factors/metabolism , Transforming Growth Factor beta1/metabolism

5.

Learning single-cell perturbation responses using neural optimal transport.

Bunne, Charlotte; Stark, Stefan G; Gut, Gabriele; Del Castillo, Jacobo Sarabia; Levesque, Mitch; Lehmann, Kjong-Van; Pelkmans, Lucas; Krause, Andreas; Rätsch, Gunnar.

Nat Methods ; 20(11): 1759-1768, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37770709

ABSTRACT

Understanding and predicting molecular responses in single cells upon chemical, genetic or mechanical perturbations is a core question in biology. Obtaining single-cell measurements typically requires the cells to be destroyed. This makes learning heterogeneous perturbation responses challenging as we only observe unpaired distributions of perturbed or non-perturbed cells. Here we leverage the theory of optimal transport and the recent advent of input convex neural architectures to present CellOT, a framework for learning the response of individual cells to a given perturbation by mapping these unpaired distributions. CellOT outperforms current methods at predicting single-cell drug responses, as profiled by scRNA-seq and a multiplexed protein-imaging technology. Further, we illustrate that CellOT generalizes well on unseen settings by (1) predicting the scRNA-seq responses of holdout patients with lupus exposed to interferon-β and patients with glioblastoma to panobinostat; (2) inferring lipopolysaccharide responses across different species; and (3) modeling the hematopoietic developmental trajectories of different subpopulations.

Subject(s)

Gene Expression Profiling , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods

6.

ResMiCo: Increasing the quality of metagenome-assembled genomes with deep learning.

Mineeva, Olga; Danciu, Daniel; Schölkopf, Bernhard; Ley, Ruth E; Rätsch, Gunnar; Youngblut, Nicholas D.

PLoS Comput Biol ; 19(5): e1011001, 2023 05.

Article in English | MEDLINE | ID: mdl-37126495

ABSTRACT

The number of published metagenome assemblies is rapidly growing due to advances in sequencing technologies. However, sequencing errors, variable coverage, repetitive genomic regions, and other factors can produce misassemblies, which are challenging to detect for taxonomically novel genomic data. Assembly errors can affect all downstream analyses of the assemblies. Accuracy for the state of the art in reference-free misassembly prediction does not exceed an AUPRC of 0.57, and it is not clear how well these models generalize to real-world data. Here, we present the Residual neural network for Misassembled Contig identification (ResMiCo), a deep learning approach for reference-free identification of misassembled contigs. To develop ResMiCo, we first generated a training dataset of unprecedented size and complexity that can be used for further benchmarking and developments in the field. Through rigorous validation, we show that ResMiCo is substantially more accurate than the state of the art, and the model is robust to novel taxonomic diversity and varying assembly methods. ResMiCo estimated 7% misassembled contigs per metagenome across multiple real-world datasets. We demonstrate how ResMiCo can be used to optimize metagenome assembly hyperparameters to improve accuracy, instead of optimizing solely for contiguity. The accuracy, robustness, and ease-of-use of ResMiCo make the tool suitable for general quality control of metagenome assemblies and assembly methodology optimization.

Subject(s)

Deep Learning , Metagenome , Metagenome/genetics , Genomics/methods , Sequence Analysis, DNA/methods , Metagenomics , Software

7.

Aligning distant sequences to graphs using long seed sketches.

Joudaki, Amir; Meterez, Alexandru; Mustafa, Harun; Groot Koerkamp, Ragnar; Kahles, André; Rätsch, Gunnar.

Genome Res ; 33(7): 1208-1217, 2023 07.

Article in English | MEDLINE | ID: mdl-37072187

ABSTRACT

Sequence-to-graph alignment is crucial for applications such as variant genotyping, read error correction, and genome assembly. We propose a novel seeding approach that relies on long inexact matches rather than short exact matches, and show that it yields a better time-accuracy trade-off in settings with up to a [Formula: see text] mutation rate. We use sketches of a subset of graph nodes, which are more robust to indels, and store them in a k-nearest neighbor index to avoid the curse of dimensionality. Our approach contrasts with existing methods and highlights the important role that sketching into vector space can play in bioinformatics applications. We show that our method scales to graphs with 1 billion nodes and has quasi-logarithmic query time for queries with an edit distance of [Formula: see text]. For such queries, longer sketch-based seeds yield a [Formula: see text] increase in recall compared with exact seeds. Our approach can be incorporated into other aligners, providing a novel direction for sequence-to-graph alignment.

Subject(s)

Algorithms , Computational Biology , Computational Biology/methods , Sequence Alignment , Sequence Analysis, DNA/methods

8.

PipeIT2: A tumor-only somatic variant calling workflow for molecular diagnostic Ion Torrent sequencing data.

Schnidrig, Desiree; Garofoli, Andrea; Benjak, Andrej; Rätsch, Gunnar; Rubin, Mark A; Piscuoglio, Salvatore; Ng, Charlotte K Y.

Genomics ; 115(2): 110587, 2023 03.

Article in English | MEDLINE | ID: mdl-36796655

ABSTRACT

Precision oncology relies on the accurate identification of somatic mutations in cancer patients. While the sequencing of the tumoral tissue is frequently part of routine clinical care, the healthy counterparts are rarely sequenced. We previously published PipeIT, a somatic variant calling workflow specific for Ion Torrent sequencing data enclosed in a Singularity container. PipeIT combines user-friendly execution, reproducibility and reliable mutation identification, but relies on matched germline sequencing data to exclude germline variants. Expanding on the original PipeIT, here we describe PipeIT2 to address the clinical need to define somatic mutations in the absence of germline control. We show that PipeIT2 achieves a > 95% recall for variants with variant allele fraction >10%, reliably detects driver and actionable mutations and filters out most of the germline mutations and sequencing artifacts. With its performance, reproducibility, and ease of execution, PipeIT2 is a valuable addition to molecular diagnostics laboratories.

Subject(s)

Neoplasms , Humans , Neoplasms/diagnosis , Neoplasms/genetics , Pathology, Molecular , Workflow , Reproducibility of Results , Precision Medicine , Mutation , High-Throughput Nucleotide Sequencing

9.

Author Correction: Genomic basis for RNA alterations in cancer.

Calabrese, Claudia; Davidson, Natalie R; Demircioglu, Deniz; Fonseca, Nuno A; He, Yao; Kahles, André; Lehmann, Kjong-Van; Liu, Fenglin; Shiraishi, Yuichi; Soulette, Cameron M; Urban, Lara; Greger, Liliana; Li, Siliang; Liu, Dongbing; Perry, Marc D; Xiang, Qian; Zhang, Fan; Zhang, Junjun; Bailey, Peter; Erkek, Serap; Hoadley, Katherine A; Hou, Yong; Huska, Matthew R; Kilpinen, Helena; Korbel, Jan O; Marin, Maximillian G; Markowski, Julia; Nandi, Tannistha; Pan-Hammarström, Qiang; Pedamallu, Chandra Sekhar; Siebert, Reiner; Stark, Stefan G; Su, Hong; Tan, Patrick; Waszak, Sebastian M; Yung, Christina; Zhu, Shida; Awadalla, Philip; Creighton, Chad J; Meyerson, Matthew; Ouellette, B F Francis; Wu, Kui; Yang, Huanming; Brazma, Alvis; Brooks, Angela N; Göke, Jonathan; Rätsch, Gunnar; Schwarz, Roland F; Stegle, Oliver; Zhang, Zemin.

Nature ; 614(7948): E37, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36697831

10.

Integrated multi-omics reveals anaplerotic rewiring in methylmalonyl-CoA mutase deficiency.

Forny, Patrick; Bonilla, Ximena; Lamparter, David; Shao, Wenguang; Plessl, Tanja; Frei, Caroline; Bingisser, Anna; Goetze, Sandra; van Drogen, Audrey; Harshman, Keith; Pedrioli, Patrick G A; Howald, Cedric; Poms, Martin; Traversi, Florian; Bürer, Céline; Cherkaoui, Sarah; Morscher, Raphael J; Simmons, Luke; Forny, Merima; Xenarios, Ioannis; Aebersold, Ruedi; Zamboni, Nicola; Rätsch, Gunnar; Dermitzakis, Emmanouil T; Wollscheid, Bernd; Baumgartner, Matthias R; Froese, D Sean.

Nat Metab ; 5(1): 80-95, 2023 01.

Article in English | MEDLINE | ID: mdl-36717752

ABSTRACT

Methylmalonic aciduria (MMA) is an inborn error of metabolism with multiple monogenic causes and a poorly understood pathogenesis, leading to the absence of effective causal treatments. Here we employ multi-layered omics profiling combined with biochemical and clinical features of individuals with MMA to reveal a molecular diagnosis for 177 out of 210 (84%) cases, the majority (148) of whom display pathogenic variants in methylmalonyl-CoA mutase (MMUT). Stratification of these data layers by disease severity shows dysregulation of the tricarboxylic acid cycle and its replenishment (anaplerosis) by glutamine. The relevance of these disturbances is evidenced by multi-organ metabolomics of a hemizygous Mmut mouse model as well as through identification of physical interactions between MMUT and glutamine anaplerotic enzymes. Using stable-isotope tracing, we find that treatment with dimethyl-oxoglutarate restores deficient tricarboxylic acid cycling. Our work highlights glutamine anaplerosis as a potential therapeutic intervention point in MMA.

Subject(s)

Metabolism, Inborn Errors , Methylmalonyl-CoA Mutase , Mice , Animals , Methylmalonyl-CoA Mutase/genetics , Methylmalonyl-CoA Mutase/metabolism , Glutamine , Multiomics , Metabolism, Inborn Errors/genetics

11.

Integrated longitudinal analysis of adult grade 4 diffuse gliomas with long-term relapse interval revealed upregulation of TGF-β signaling in recurrent tumors.

Kashani, Elham; Schnidrig, Désirée; Gheinani, Ali Hashemi; Ninck, Martina Selina; Zens, Philipp; Maragkou, Theoni; Baumgartner, Ulrich; Schucht, Philippe; Rätsch, Gunnar; Rubin, Mark A; Berezowska, Sabina; Ng, Charlotte K Y; Vassella, Erik.

Neuro Oncol ; 25(4): 662-673, 2023 04 06.

Article in English | MEDLINE | ID: mdl-36124685

ABSTRACT

BACKGROUND: Adult-type diffuse gliomas, CNS WHO grade 4 are the most aggressive primary brain tumors and represent a particular challenge for therapeutic intervention. METHODS: In a single-center retrospective study of matched pairs of initial and post-therapeutic glioma cases with a recurrence period greater than 1 year, we performed whole exome sequencing combined with mRNA and microRNA expression profiling to identify processes that are altered in recurrent gliomas. RESULTS: Mutational analysis of recurrent gliomas revealed early branching evolution in 75% of the patients. High plasticity was confirmed at the mRNA and miRNA levels. SBS1 signature was reduced and SBS11 was elevated, demonstrating the effect of alkylating agent therapy on the mutational landscape. There was no evidence for secondary genomic alterations driving therapy resistance. ALK7/ACVR1C and LTBP1 were upregulated, whereas LEFTY2 was downregulated, pointing towards enhanced Tumor Growth Factor β (TGF-β) signaling in recurrent gliomas. Consistently, altered microRNA expression profiles pointed towards enhanced Nuclear Factor Kappa B and Wnt signaling that, cooperatively with TGF-β, induces epithelial to mesenchymal transition (EMT), migration, and stemness. TGF-β-induced expression of pro-apoptotic proteins and repression of antiapoptotic proteins were uncoupled in the recurrent tumor. CONCLUSIONS: Our results suggest an important role of TGF-β signaling in recurrent gliomas. This may have clinical implications since TGF-β inhibitors have entered clinical phase studies and may potentially be used in combination therapy to interfere with chemoradiation resistance. Recurrent gliomas show high incidence of early branching evolution. High tumor plasticity is confirmed at the level of microRNA and mRNA expression profiles.

Subject(s)

Brain Neoplasms , Glioma , MicroRNAs , Humans , Adult , Up-Regulation , Epithelial-Mesenchymal Transition/genetics , Retrospective Studies , Glioma/pathology , Transforming Growth Factor beta/genetics , Transforming Growth Factor beta/metabolism , MicroRNAs/genetics , Recurrence , RNA, Messenger/metabolism , Brain Neoplasms/metabolism , Cell Line, Tumor , Activin Receptors, Type I/genetics , Activin Receptors, Type I/metabolism

12.

SF3B1 facilitates HIF1-signaling and promotes malignancy in pancreatic cancer.

Simmler, Patrik; Cortijo, Cédric; Koch, Lisa Maria; Galliker, Patricia; Angori, Silvia; Bolck, Hella Anna; Mueller, Christina; Vukolic, Ana; Mirtschink, Peter; Christinat, Yann; Davidson, Natalie R; Lehmann, Kjong-Van; Pellegrini, Giovanni; Pauli, Chantal; Lenggenhager, Daniela; Guccini, Ilaria; Ringel, Till; Hirt, Christian; Marquart, Kim Fabiano; Schaefer, Moritz; Rätsch, Gunnar; Peter, Matthias; Moch, Holger; Stoffel, Markus; Schwank, Gerald.

Cell Rep ; 40(8): 111266, 2022 08 23.

Article in English | MEDLINE | ID: mdl-36001976

ABSTRACT

Mutations in the splicing factor SF3B1 are frequently occurring in various cancers and drive tumor progression through the activation of cryptic splice sites in multiple genes. Recent studies also demonstrate a positive correlation between the expression levels of wild-type SF3B1 and tumor malignancy. Here, we demonstrate that SF3B1 is a hypoxia-inducible factor (HIF)-1 target gene that positively regulates HIF1 pathway activity. By physically interacting with HIF1α, SF3B1 facilitates binding of the HIF1 complex to hypoxia response elements (HREs) to activate target gene expression. To further validate the relevance of this mechanism for tumor progression, we show that a reduction in SF3B1 levels via monoallelic deletion of Sf3b1 impedes tumor formation and progression via impaired HIF signaling in a mouse model for pancreatic cancer. Our work uncovers an essential role of SF3B1 in HIF1 signaling, thereby providing a potential explanation for the link between high SF3B1 expression and aggressiveness of solid tumors.

Subject(s)

Pancreatic Neoplasms , Signal Transduction , Animals , Cell Line, Tumor , Hypoxia/metabolism , Hypoxia-Inducible Factor 1/metabolism , Hypoxia-Inducible Factor 1, alpha Subunit/genetics , Hypoxia-Inducible Factor 1, alpha Subunit/metabolism , Mice , Pancreatic Neoplasms/genetics , Phosphoproteins/genetics , Phosphoproteins/metabolism , RNA Splice Sites , RNA Splicing Factors/genetics , RNA Splicing Factors/metabolism

13.

A Rapid Translational Immune Response Program in CD8 Memory T Lymphocytes.

Salloum, Darin; Singh, Kamini; Davidson, Natalie R; Cao, Linlin; Kuo, David; Sanghvi, Viraj R; Jiang, Man; Lafoz, Maria Tello; Viale, Agnes; Ratsch, Gunnar; Wendel, Hans-Guido.

J Immunol ; 209(6): 1189-1199, 2022 09 15.

Article in English | MEDLINE | ID: mdl-36002234

ABSTRACT

The activation of memory T cells is a very rapid and concerted cellular response that requires coordination between cellular processes in different compartments and on different time scales. In this study, we use ribosome profiling and deep RNA sequencing to define the acute mRNA translation changes in CD8 memory T cells following initial activation events. We find that initial translation enables subsequent events of human and mouse T cell activation and expansion. Briefly, early events in the activation of Ag-experienced CD8 T cells are insensitive to transcriptional blockade with actinomycin D, and instead depend on the translation of pre-existing mRNAs and are blocked by cycloheximide. Ribosome profiling identifies ~92 mRNAs that are recruited into ribosomes following CD8 T cell stimulation. These mRNAs typically have structured GC and pyrimidine-rich 5' untranslated regions and they encode key regulators of T cell activation and proliferation such as Notch1, Ifngr1, Il2rb, and serine metabolism enzymes Psat1 and Shmt2 (serine hydroxymethyltransferase 2), as well as translation factors eEF1a1 (eukaryotic elongation factor α1) and eEF2 (eukaryotic elongation factor 2). The increased production of receptors of IL-2 and IFN-γ precedes the activation of gene expression and augments cellular signals and T cell activation. Taken together, we identify an early RNA translation program that acts in a feed-forward manner to enable the rapid and dramatic process of CD8 memory T cell expansion and activation.

Subject(s)

Glycine Hydroxymethyltransferase , Interleukin-2 , 5' Untranslated Regions , Animals , CD8-Positive T-Lymphocytes , Cycloheximide/metabolism , Dactinomycin/metabolism , Glycine Hydroxymethyltransferase/genetics , Glycine Hydroxymethyltransferase/metabolism , Humans , Immunologic Memory , Interleukin-2/metabolism , Lymphocyte Activation , Memory T Cells , Mice , Peptide Elongation Factor 2/genetics , Peptide Elongation Factor 2/metabolism , Peptide Elongation Factors/genetics , Pyrimidines/metabolism , RNA, Messenger/genetics , Serine/genetics

14.

RNA Instant Quality Check: Alignment-Free RNA-Degradation Detection.

Lehmann, Kjong-van; Kahles, Andre; Murr, Magdalena; Rätsch, Gunnar.

J Comput Biol ; 29(8): 857-866, 2022 08.

Article in English | MEDLINE | ID: mdl-35776515

ABSTRACT

With the constant increase of large-scale genomic data projects, automated and high-throughput quality assessment becomes a crucial component of any analysis. Whereas small projects often have a more homogeneous design and a manageable structure allowing for a manual per-sample analysis of quality, large-scale studies tend to be much more heterogeneous and complex. Many quality metrics have been developed to assess the quality of an individual sample on the raw read level. Degradation effects are typically assessed based on the RNA integrity (RIN) score, or on postalignment data. In this study, we show that single commonly used quality criteria such as the RIN score alone are not sufficient to ensure RNA sample quality. We developed a new approach and provide an efficient tool that estimates RNA sample degradation by computing the 5'/3' bias based on all genes in an alignment-free manner. That enables degradation assessment right after data generation and not during the analysis procedure allowing for early intervention in the sample handling process. Our analysis shows that this strategy is fast, robust to annotation and differences in library size, and provides complementary quality information to RIN scores enabling the accurate identification of degraded samples.

Subject(s)

RNA Stability , RNA , Genomics , RNA/chemistry , RNA/genetics , Sequence Analysis, RNA/methods

15.

SECEDO: SNV-based subclone detection using ultra-low coverage single-cell DNA sequencing.

Rozhonová, Hana; Danciu, Daniel; Stark, Stefan; Rätsch, Gunnar; Kahles, André; Lehmann, Kjong-Van.

Bioinformatics ; 38(18): 4293-4300, 2022 09 15.

Article in English | MEDLINE | ID: mdl-35900151

ABSTRACT

MOTIVATION: Several recently developed single-cell DNA sequencing technologies enable whole-genome sequencing of thousands of cells. However, the ultra-low coverage of the sequenced data (<0.05× per cell) mostly limits their usage to the identification of copy number alterations in multi-megabase segments. Many tumors are not copy number-driven, and thus single-nucleotide variant (SNV)-based subclone detection may contribute to a more comprehensive view on intra-tumor heterogeneity. Due to the low coverage of the data, the identification of SNVs is only possible when superimposing the sequenced genomes of hundreds of genetically similar cells. Thus, we have developed a new approach to efficiently cluster tumor cells based on a Bayesian filtering approach of relevant loci and exploiting read overlap and phasing. RESULTS: We developed Single Cell Data Tumor Clusterer (SECEDO, lat. 'to separate'), a new method to cluster tumor cells based solely on SNVs, inferred on ultra-low coverage single-cell DNA sequencing data. We applied SECEDO to a synthetic dataset simulating 7250 cells and eight tumor subclones from a single patient and were able to accurately reconstruct the clonal composition, detecting 92.11% of the somatic SNVs, with the smallest clusters representing only 6.9% of the total population. When applied to five real single-cell sequencing datasets from a breast cancer patient, each consisting of ≈2000 cells, SECEDO was able to recover the major clonal composition in each dataset at the original coverage of 0.03×, achieving an Adjusted Rand Index (ARI) score of ≈0.6. The current state-of-the-art SNV-based clustering method achieved an ARI score of ≈0, even after merging cells to create higher coverage data (factor 10 increase), and was only able to match SECEDO's performance when pooling data from all five datasets, in addition to artificially increasing the sequencing coverage by a factor of 7. Variant calling on the resulting clusters recovered more than twice as many SNVs as would have been detected if calling on all cells together. Further, the allelic ratio of the called SNVs on each subcluster was more than double relative to the allelic ratio of the SNVs called without clustering, thus demonstrating that calling variants on subclones, in addition to both increasing sensitivity of SNV detection and attaching SNVs to subclones, significantly increases the confidence of the called variants. AVAILABILITY AND IMPLEMENTATION: SECEDO is implemented in C++ and is publicly available at https://github.com/ratschlab/secedo. Instructions to download the data and the evaluation code to reproduce the findings in this paper are available at: https://github.com/ratschlab/secedo-evaluation. The code and data of the submitted version are archived at: https://doi.org/10.5281/zenodo.6516955. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
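
The Adjusted Rand Index (ARI) quoted above compares a predicted clustering with a reference clustering and corrects for chance agreement: values near 1 indicate near-identical partitions and values near 0 indicate agreement no better than random. The following is a minimal sketch of the standard pair-counting (Hubert and Arabie) formula, not SECEDO's evaluation code; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI from the contingency table of two label assignments."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions agree trivially
    # Pairs of items placed together in both, in the reference, in the prediction.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (together_both - expected) / (maximum - expected)
```

The score is invariant to renaming clusters, so `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0.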

Subject(s)

High-Throughput Nucleotide Sequencing , Neoplasms , Humans , High-Throughput Nucleotide Sequencing/methods , Bayes Theorem , Sequence Analysis, DNA , Genome , Base Sequence , Neoplasms/genetics , Polymorphism, Single Nucleotide

16.

Identification, Quantification, and Testing of Alternative Splicing Events from RNA-Seq Data Using SplAdder.

Markolin, Philipp; Rätsch, Gunnar; Kahles, André.

Methods Mol Biol ; 2493: 167-193, 2022.

Article in English | MEDLINE | ID: mdl-35751815

ABSTRACT

Alternative splicing (AS) is a regulatory process during mRNA maturation that shapes higher eukaryotes' complex transcriptomes. High-throughput sequencing of RNA (RNA-Seq) allows for measurements of AS transcripts at an unprecedented depth and diversity. The ever-expanding catalog of known AS events provides biological insights into gene regulation, population genetics, or in the context of disease. Here, we present an overview on the usage of SplAdder, a graph-based alternative splicing toolbox, which can integrate an arbitrarily large number of RNA-Seq alignments and a given annotation file to augment the shared annotation based on RNA-Seq evidence. The shared augmented annotation graph is then used to identify, quantify, and confirm alternative splicing events based on the RNA-Seq data. Splice graphs for individual alignments can also be tested for significant quantitative differences between other samples or groups of samples.

Subject(s)

Alternative Splicing , RNA , High-Throughput Nucleotide Sequencing , RNA/genetics , RNA-Seq , Sequence Analysis, RNA

17.

Lossless indexing with counting de Bruijn graphs.

Karasikov, Mikhail; Mustafa, Harun; Rätsch, Gunnar; Kahles, André.

Genome Res ; 2022 May 24.

Article in English | MEDLINE | ID: mdl-35609994

ABSTRACT

Sequencing data are rapidly accumulating in public repositories. Making this resource accessible for interactive analysis at scale requires efficient approaches for its storage and indexing. There have recently been remarkable advances in building compressed representations of annotated (or colored) de Bruijn graphs for efficiently indexing k-mer sets. However, approaches for representing quantitative attributes such as gene expression or genome positions in a general manner have remained underexplored. In this work, we propose counting de Bruijn graphs, a notion generalizing annotated de Bruijn graphs by supplementing each node-label relation with one or many attributes (e.g., a k-mer count or its positions). Counting de Bruijn graphs index k-mer abundances from 2652 human RNA-seq samples in over eightfold smaller representations compared with state-of-the-art bioinformatics tools and are faster to construct and query. Furthermore, counting de Bruijn graphs with positional annotations losslessly represent entire reads in indexes on average 27% smaller than the input compressed with gzip for human Illumina RNA-seq and 57% smaller for Pacific Biosciences (PacBio) HiFi sequencing of viral samples. A complete searchable index of all viral PacBio SMRT reads from NCBI's Sequence Read Archive (SRA) (152,884 samples, 875 Gbp) comprises only 178 GB. Finally, on the full RefSeq collection, we generate a lossless and fully queryable index that is 4.6-fold smaller than the MegaBLAST index. The techniques proposed in this work naturally complement existing methods and tools using de Bruijn graphs, and significantly broaden their applicability: from indexing k-mer counts and genome positions to implementing novel sequence alignment algorithms on top of highly compressed graph-based sequence indexes.
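
The notion of attaching an attribute to each node-label relation can be illustrated with an uncompressed toy index in which every k-mer (graph node) maps to the samples (labels) containing it, each with a k-mer count. This is only a conceptual sketch under simplifying assumptions: it uses plain dictionaries rather than the compressed representations the abstract describes, and the function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_counting_index(samples, k):
    """Map each k-mer (node) to {sample label: number of occurrences}."""
    index = defaultdict(dict)
    for label, reads in samples.items():
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                index[kmer][label] = index[kmer].get(label, 0) + 1
    return index


def successors(index, kmer):
    """Outgoing de Bruijn edges: indexed k-mers overlapping `kmer` by k-1 characters."""
    return [kmer[1:] + c for c in "ACGT" if kmer[1:] + c in index]


# Hypothetical input: two labeled samples with short reads.
samples = {"s1": ["ACGTAC"], "s2": ["CGTACG", "ACGT"]}
index = build_counting_index(samples, k=3)
```

Here `index["ACG"]` holds a count of 1 for `s1` and 2 for `s2`, which is the kind of quantitative annotation that a purely binary (present or absent) colored graph cannot express.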

18.

Topology-based sparsification of graph annotations.

Danciu, Daniel; Karasikov, Mikhail; Mustafa, Harun; Kahles, André; Rätsch, Gunnar.

Bioinformatics ; 37(Suppl_1): i169-i176, 2021 07 12.

Article in English | MEDLINE | ID: mdl-34252940

ABSTRACT

MOTIVATION: Since the amount of published biological sequencing data is growing exponentially, efficient methods for storing and indexing this data are more needed than ever to truly benefit from this invaluable resource for biomedical research. Labeled de Bruijn graphs are a frequently used approach for representing large sets of sequencing data. While significant progress has been made to succinctly represent the graph itself, efficient methods for storing labels on such graphs are still rapidly evolving. RESULTS: In this article, we present RowDiff, a new technique for compacting graph labels by leveraging expected similarities in annotations of vertices adjacent in the graph. RowDiff can be constructed in linear time relative to the number of vertices and labels in the graph, and in space proportional to the graph size. In addition, construction can be efficiently parallelized and distributed, making the technique applicable to graphs with trillions of nodes. RowDiff can be viewed as an intermediary sparsification step of the original annotation matrix and can thus naturally be combined with existing generic schemes for compressed binary matrices. Experiments on 10 000 RNA-seq datasets show that RowDiff combined with multi-BRWT results in a 30% reduction in annotation footprint over Mantis-MST, the most compact annotation representation previously known. Experiments on the sparser Fungi subset of the RefSeq collection show that applying RowDiff sparsification reduces the size of individual annotation columns stored as compressed bit vectors by an average factor of 42. When combining RowDiff with a multi-BRWT representation, the resulting annotation is 26 times smaller than Mantis-MST. AVAILABILITY AND IMPLEMENTATION: RowDiff is implemented in C++ within the MetaGraph framework. The source code and the data used in the experiments are publicly available at https://github.com/ratschlab/row_diff.

Subject(s)

Algorithms , Biomedical Research , Software

19.

Targeting eIF4A-Dependent Translation of KRAS Signaling Molecules.

Singh, Kamini; Lin, Jianan; Lecomte, Nicolas; Mohan, Prathibha; Gokce, Askan; Sanghvi, Viraj R; Jiang, Man; Grbovic-Huezo, Olivera; Burcul, Antonija; Stark, Stefan G; Romesser, Paul B; Chang, Qing; Melchor, Jerry P; Beyer, Rachel K; Duggan, Mark; Fukase, Yoshiyuki; Yang, Guangli; Ouerfelli, Ouathek; Viale, Agnes; de Stanchina, Elisa; Stamford, Andrew W; Meinke, Peter T; Rätsch, Gunnar; Leach, Steven D; Ouyang, Zhengqing; Wendel, Hans-Guido.

Cancer Res ; 81(8): 2002-2014, 2021 04 15.

Article in English | MEDLINE | ID: mdl-33632898

ABSTRACT

Pancreatic adenocarcinoma (PDAC) epitomizes a deadly cancer driven by abnormal KRAS signaling. Here, we show that the eIF4A RNA helicase is required for translation of key KRAS signaling molecules and that pharmacological inhibition of eIF4A has single-agent activity against murine and human PDAC models at safe dose levels. eIF4A was uniquely required for the translation of mRNAs with long and highly structured 5' untranslated regions, including those with multiple G-quadruplex elements. Computational analyses identified these features in mRNAs encoding KRAS and key downstream molecules. Transcriptome-scale ribosome footprinting accurately identified eIF4A-dependent mRNAs in PDAC, including critical KRAS signaling molecules such as PI3K, RALA, RAC2, MET, MYC, and YAP1. These findings contrast with a recent study that relied on an older method, polysome fractionation, and implicated redox-related genes as eIF4A clients. Together, our findings highlight the power of ribosome footprinting in conjunction with deep RNA sequencing in accurately decoding translational control mechanisms and define the therapeutic mechanism of eIF4A inhibitors in PDAC. SIGNIFICANCE: These findings document the coordinate, eIF4A-dependent translation of RAS-related oncogenic signaling molecules and demonstrate therapeutic efficacy of eIF4A blockade in pancreatic adenocarcinoma.

Subject(s)

Adenocarcinoma/metabolism , Eukaryotic Initiation Factor-4A/metabolism , Pancreatic Neoplasms/metabolism , Proto-Oncogene Proteins p21(ras)/metabolism , RNA, Messenger/metabolism , Ribosomes/metabolism , 5' Untranslated Regions , Adaptor Proteins, Signal Transducing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Adenocarcinoma/drug therapy , Animals , Cell Line, Tumor , Cycloheximide/pharmacology , Eukaryotic Initiation Factor-4A/antagonists & inhibitors , G-Quadruplexes , Genes, ras/genetics , Humans , Mice , Mice, Nude , Mutation , Neoplasm Transplantation , Oxidation-Reduction , Pancreatic Neoplasms/drug therapy , Phosphatidylinositol 3-Kinases/genetics , Phosphatidylinositol 3-Kinases/metabolism , Polyribosomes/metabolism , Protein Biosynthesis , Protein Synthesis Inhibitors/pharmacology , Proto-Oncogene Proteins c-met/genetics , Proto-Oncogene Proteins c-met/metabolism , Proto-Oncogene Proteins c-myc/genetics , Proto-Oncogene Proteins c-myc/metabolism , RNA Helicases , Sequence Analysis, RNA , Transcription Factors/genetics , Transcription Factors/metabolism , Transcriptome , Triterpenes/pharmacology , YAP-Signaling Proteins , rac GTP-Binding Proteins/genetics , rac GTP-Binding Proteins/metabolism , ral GTP-Binding Proteins/genetics , ral GTP-Binding Proteins/metabolism , RAC2 GTP-Binding Protein

20.

The Tumor Profiler Study: integrated, multi-omic, functional tumor profiling for clinical decision support.

Irmisch, Anja; Bonilla, Ximena; Chevrier, Stéphane; Lehmann, Kjong-Van; Singer, Franziska; Toussaint, Nora C; Esposito, Cinzia; Mena, Julien; Milani, Emanuela S; Casanova, Ruben; Stekhoven, Daniel J; Wegmann, Rebekka; Jacob, Francis; Sobottka, Bettina; Goetze, Sandra; Kuipers, Jack; Sarabia Del Castillo, Jacobo; Prummer, Michael; Tuncel, Mustafa A; Menzel, Ulrike; Jacobs, Andrea; Engler, Stefanie; Sivapatham, Sujana; Frei, Anja L; Gut, Gabriele; Ficek, Joanna; Miglino, Nicola; Aebersold, Rudolf; Bacac, Marina; Beerenwinkel, Niko; Beisel, Christian; Bodenmiller, Bernd; Dummer, Reinhard; Heinzelmann-Schwarz, Viola; Koelzer, Viktor H; Manz, Markus G; Moch, Holger; Pelkmans, Lucas; Snijder, Berend; Theocharides, Alexandre P A; Tolnay, Markus; Wicki, Andreas; Wollscheid, Bernd; Rätsch, Gunnar; Levesque, Mitchell P.

Cancer Cell ; 39(3): 288-293, 2021 03 08.

Article in English | MEDLINE | ID: mdl-33482122

ABSTRACT

The application and integration of molecular profiling technologies create novel opportunities for personalized medicine. Here, we introduce the Tumor Profiler Study, an observational trial combining a prospective diagnostic approach to assess the relevance of in-depth tumor profiling to support clinical decision-making with an exploratory approach to improve the biological understanding of the disease.

Subject(s)

Neoplasms/genetics , Neoplasms/metabolism , Clinical Decision-Making/methods , Computational Biology/methods , Decision Support Systems, Clinical , Humans , Precision Medicine/methods , Prospective Studies
