1.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38379414

ABSTRACT

MOTIVATION: The process of analyzing high-throughput sequencing data often requires the identification and extraction of specific target sequences. This could include tasks such as identifying cellular barcodes and UMIs in single-cell data, and specific genetic variants for genotyping. However, existing tools that perform these functions are often task-specific, such as only demultiplexing barcodes for a dedicated type of experiment, or are not tolerant of noise in the sequencing data. RESULTS: To overcome these limitations, we developed Flexiplex, a versatile and fast sequence searching and demultiplexing tool for omics data, which is based on the Levenshtein distance and thus allows imperfect matches. We demonstrate Flexiplex's application on three use cases: identifying cell-line-specific sequences in Illumina short-read single-cell data, and discovering and demultiplexing cellular barcodes from noisy long-read single-cell RNA-seq data. We show that Flexiplex achieves an excellent balance of accuracy and computational efficiency compared to leading task-specific tools. AVAILABILITY AND IMPLEMENTATION: Flexiplex is available at https://davidsongroup.github.io/flexiplex/.
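
Illustrative note: the abstract says the search is based on the Levenshtein distance so that imperfect matches are allowed. The sketch below shows that general idea only and is not Flexiplex's implementation. The function names, the `max_distance` threshold, and the rule of rejecting ties are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def assign_barcode(observed, known_barcodes, max_distance=2):
    """Return the single closest known barcode within max_distance, else None.

    Hypothetical helper: the threshold and the tie rule are assumptions.
    """
    scored = sorted((levenshtein(observed, bc), bc) for bc in known_barcodes)
    if not scored or scored[0][0] > max_distance:
        return None  # nothing close enough
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # two barcodes are equally close, so the read is ambiguous
    return scored[0][1]
```

Under these assumptions, a read whose barcode carries a single sequencing error is still assigned to the intended barcode, while a read that is equally close to two barcodes is left unassigned.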


Subject(s)
Search Engine , Software , Sequence Analysis, DNA , High-Throughput Nucleotide Sequencing , Electronic Data Processing
3.
Nat Commun ; 14(1): 3403, 2023 06 09.
Article in English | MEDLINE | ID: mdl-37296101

ABSTRACT

Squamous cell carcinoma antigen recognized by T cells 3 (SART3) is an RNA-binding protein with numerous biological functions including recycling small nuclear RNAs to the spliceosome. Here, we identify recessive variants in SART3 in nine individuals presenting with intellectual disability, global developmental delay and a subset of brain anomalies, together with gonadal dysgenesis in 46,XY individuals. Knockdown of the Drosophila orthologue of SART3 reveals a conserved role in testicular and neuronal development. Human induced pluripotent stem cells carrying patient variants in SART3 show disruption to multiple signalling pathways, upregulation of spliceosome components and demonstrate aberrant gonadal and neuronal differentiation in vitro. Collectively, these findings suggest that bi-allelic SART3 variants underlie a spliceosomopathy which we tentatively propose be termed INDYGON syndrome (Intellectual disability, Neurodevelopmental defects and Developmental delay with 46,XY GONadal dysgenesis). Our findings will enable additional diagnoses and improved outcomes for individuals born with this condition.


Subject(s)
Gonadal Dysgenesis , Induced Pluripotent Stem Cells , Intellectual Disability , Male , Humans , Testis/metabolism , Induced Pluripotent Stem Cells/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Antigens, Neoplasm
6.
Blood Adv ; 6(7): 2373-2387, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35061886

ABSTRACT

Philadelphia-like (Ph-like) acute lymphoblastic leukemia (ALL) is a high-risk subtype of B-cell ALL characterized by a gene expression profile resembling Philadelphia chromosome-positive ALL (Ph+ ALL) in the absence of BCR-ABL1. Tyrosine kinase-activating fusions, some involving ABL1, are recurrent drivers of Ph-like ALL and are targetable with tyrosine kinase inhibitors (TKIs). We identified a rare instance of SFPQ-ABL1 in a child with Ph-like ALL. SFPQ-ABL1 expressed in cytokine-dependent cell lines was sufficient to transform cells and these cells were sensitive to ABL1-targeting TKIs. In contrast to BCR-ABL1, SFPQ-ABL1 localized to the nuclear compartment and was a weaker driver of cellular proliferation. Phosphoproteomics analysis showed upregulation of cell cycle, DNA replication, and spliceosome pathways, and downregulation of signal transduction pathways, including ErbB, NF-κB, vascular endothelial growth factor (VEGF), and MAPK signaling in SFPQ-ABL1-expressing cells compared with BCR-ABL1-expressing cells. SFPQ-ABL1 expression did not activate phosphatidylinositol 3-kinase/protein kinase B (PI3K/AKT) signaling and was associated with phosphorylation of G2/M cell cycle proteins. SFPQ-ABL1 was sensitive to navitoclax and S-63845 and promotes cell survival by maintaining expression of Mcl-1 and Bcl-xL. SFPQ-ABL1 has functionally distinct mechanisms by which it drives ALL, including subcellular localization, proliferative capacity, and activation of cellular pathways. These findings highlight the role that fusion partners have in mediating the function of ABL1 fusions.


Subject(s)
Phosphatidylinositol 3-Kinases , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Child , Fusion Proteins, bcr-abl/genetics , Fusion Proteins, bcr-abl/metabolism , Humans , Phosphatidylinositol 3-Kinases/metabolism , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , Protein Kinase Inhibitors/pharmacology , Signal Transduction , Vascular Endothelial Growth Factor A
7.
Genome Biol ; 23(1): 10, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34991664

ABSTRACT

In cancer, fusions are important diagnostic markers and targets for therapy. Long-read transcriptome sequencing allows the discovery of fusions with their full-length isoform structure. However, due to higher sequencing error rates, fusion finding algorithms designed for short reads do not work. Here we present JAFFAL, to identify fusions from long-read transcriptome sequencing. We validate JAFFAL using simulations, cell lines, and patient data from Nanopore and PacBio. We apply JAFFAL to single-cell data and find fusions spanning three genes demonstrating transcripts detected from complex rearrangements. JAFFAL is available at https://github.com/Oshlack/JAFFA/wiki .


Subject(s)
High-Throughput Nucleotide Sequencing , Transcriptome , Algorithms , Gene Fusion , Humans , Sequence Analysis, DNA
8.
Genome Biol ; 22(1): 296, 2021 10 22.
Article in English | MEDLINE | ID: mdl-34686194

ABSTRACT

Calling fusion genes from RNA-seq data is well established, but other transcriptional variants are difficult to detect using existing approaches. To identify all types of variants in transcriptomes we developed MINTIE, an integrated pipeline for RNA-seq data. We take a reference-free approach, combining de novo assembly of transcripts with differential expression analysis to identify up-regulated novel variants in a case sample. We compare MINTIE with eight other approaches, detecting > 85% of variants while no other method is able to achieve this. We posit that MINTIE will be able to identify new disease variants across a range of disease types.


Subject(s)
RNA Splicing , RNA-Seq , Software , Transcriptome , Algorithms , Genetic Variation , Humans , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Rare Diseases/genetics
9.
Development ; 147(20)2020 10 29.
Article in English | MEDLINE | ID: mdl-33028609

ABSTRACT

The genetic regulatory network controlling early fate choices during human blood cell development is not well understood. We used human pluripotent stem cell reporter lines to track the development of endothelial and haematopoietic populations in an in vitro model of human yolk-sac development. We identified SOX17-CD34+CD43- endothelial cells at day 2 of blast colony development, as a haemangioblast-like branch point from which SOX17-CD34+CD43+ blood cells and SOX17+CD34+CD43- endothelium subsequently arose. Most human blood cell development was dependent on RUNX1. Deletion of RUNX1 only permitted a single wave of yolk sac-like primitive erythropoiesis, but no yolk sac myelopoiesis or aorta-gonad-mesonephros (AGM)-like haematopoiesis. Blocking GFI1 and/or GFI1B activity with a small molecule inhibitor abrogated all blood cell development, even in cell lines with an intact RUNX1 gene. Together, our data define the hierarchical requirements for RUNX1, GFI1 and/or GFI1B during early human haematopoiesis arising from a yolk sac-like SOX17-negative haemogenic endothelial intermediate.


Subject(s)
Blood Cells/metabolism , Core Binding Factor Alpha 2 Subunit/metabolism , DNA-Binding Proteins/metabolism , Endothelium/metabolism , Hematopoiesis , Proto-Oncogene Proteins/metabolism , Repressor Proteins/metabolism , SOXF Transcription Factors/metabolism , Transcription Factors/metabolism , Yolk Sac/metabolism , Blood Cells/cytology , Cell Differentiation , Cell Lineage , Erythroid Cells/cytology , Erythroid Cells/metabolism , Histone Demethylases/antagonists & inhibitors , Histone Demethylases/metabolism , Humans , Models, Biological , Transcription, Genetic
11.
Blood Adv ; 4(5): 930-942, 2020 03 10.
Article in English | MEDLINE | ID: mdl-32150610

ABSTRACT

Acute lymphoblastic leukemia (ALL) is the most common childhood malignancy, and implementation of risk-adapted therapy has been instrumental in the dramatic improvements in clinical outcomes. A key to risk-adapted therapies includes the identification of genomic features of individual tumors, including chromosome number (for hyper- and hypodiploidy) and gene fusions, notably ETV6-RUNX1, TCF3-PBX1, and BCR-ABL1 in B-cell ALL (B-ALL). RNA-sequencing (RNA-seq) of large ALL cohorts has expanded the number of recurrent gene fusions recognized as drivers in ALL, and identification of these new entities will contribute to refining ALL risk stratification. We used RNA-seq on 126 ALL patients from our clinical service to test the utility of including RNA-seq in standard-of-care diagnostic pipelines to detect gene rearrangements and IKZF1 deletions. RNA-seq identified 86% of rearrangements detected by standard-of-care diagnostics. KMT2A (MLL) rearrangements, although usually identified, were the most commonly missed by RNA-seq as a result of low expression. RNA-seq identified rearrangements that were not detected by standard-of-care testing in 9 patients. These were found in patients who were not classifiable using standard molecular assessment. We developed an approach to detect the most common IKZF1 deletion from RNA-seq data and validated this using an RQ-PCR assay. We applied an expression classifier to identify Philadelphia chromosome-like B-ALL patients. T-ALL proved a rich source of novel gene fusions, which have clinical implications or provide insights into disease biology. Our experience shows that RNA-seq can be implemented within an individual clinical service to enhance the current molecular diagnostic risk classification of ALL.


Subject(s)
Oncogene Proteins, Fusion , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Child , Gene Rearrangement , Genomics , Humans , Oncogene Proteins, Fusion/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Sequence Analysis, RNA
12.
F1000Res ; 8: 265, 2019.
Article in English | MEDLINE | ID: mdl-31143443

ABSTRACT

Background: RNA sequencing has enabled high-throughput and fine-grained quantitative analyses of the transcriptome. While differential gene expression is the most widely used application of this technology, RNA-seq data also has the resolution to infer differential transcript usage (DTU), which can elucidate the role of different transcript isoforms between experimental conditions, cell types or tissues. DTU has typically been inferred from exon-count data, which has issues with assigning reads unambiguously to counting bins, and requires alignment of reads to the genome. Recently, approaches have emerged that use transcript quantification estimates directly for DTU. Transcript counts can be inferred from 'pseudo' or lightweight aligners, which are significantly faster than traditional genome alignment. However, recent evaluations show lower sensitivity in DTU analysis. Transcript abundances are estimated from equivalence classes (ECs), which determine the transcripts that any given read is compatible with. Recent work has proposed performing differential expression testing directly on equivalence class read counts. Methods: Here we demonstrate that ECs can be used effectively with existing count-based methods for detecting DTU. We evaluate this approach on simulated human and Drosophila data, as well as on a real dataset through subset testing. Results: We find that EC counts have similar sensitivity and false discovery rates to exon-level counts but can be generated in a fraction of the time through the use of pseudo-aligners. Conclusions: We posit that equivalence class read counts are a natural unit on which to perform many types of analysis.
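
Illustrative note: the abstract defines an equivalence class as the set of transcripts a read is compatible with, and uses per-class read counts as the unit of analysis. The toy sketch below shows how such counts could be tallied. The input mapping, the read and transcript names, and the function name are hypothetical, and real pseudo-aligners compute compatibility very differently.

```python
from collections import Counter


def equivalence_class_counts(read_compatibility):
    """Count reads per equivalence class.

    read_compatibility maps a read ID to the set of transcript IDs the read
    is compatible with (a hypothetical, pre-computed input). Reads sharing
    exactly the same transcript set fall into the same equivalence class.
    """
    counts = Counter()
    for transcripts in read_compatibility.values():
        if transcripts:  # skip reads compatible with no transcript
            counts[frozenset(transcripts)] += 1
    return counts


# Toy data: r1 and r2 are compatible with the same two transcripts.
reads = {
    "r1": {"txA", "txB"},
    "r2": {"txB", "txA"},
    "r3": {"txA"},
    "r4": set(),
}
ec_counts = equivalence_class_counts(reads)
```

In this sketch each distinct transcript set becomes one counting bin, so the resulting table could in principle be passed to a count-based method in place of exon-level counts.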


Subject(s)
Gene Expression Profiling , Protein Isoforms , Transcriptome , Animals , Exons , Humans , Mice , Sequence Analysis, RNA
13.
Pediatr Blood Cancer ; 66(10): e27897, 2019 10.
Article in English | MEDLINE | ID: mdl-31250523

ABSTRACT

We report two patients with leukaemia driven by the rare CNTRL-FGFR1 fusion oncogene. This fusion arises from a t(8;9)(p12;q33) translocation, and is a rare driver of biphenotypic leukaemia in children. We used RNA sequencing to report novel features of expressed CNTRL-FGFR1, including CNTRL-FGFR1 fusion alternative splicing. From this knowledge, we designed and tested a Droplet Digital PCR assay that detects CNTRL-FGFR1 expression to approximately one cell in 100 000 using fusion breakpoint-specific primers and probes. We also utilised cell-line models to show that effective tyrosine kinase inhibitors, which may be included in treatment regimens for this disease, are only those that block FGFR1 phosphorylation.


Subject(s)
Cell Cycle Proteins/genetics , Leukemia/genetics , Leukemia/therapy , Molecular Targeted Therapy/methods , Receptor, Fibroblast Growth Factor, Type 1/genetics , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Child , Humans , Infant , Male , Oncogene Fusion , Oncogene Proteins, Fusion/genetics , Polymerase Chain Reaction/methods , Protein Kinase Inhibitors/therapeutic use
14.
Gigascience ; 7(7)2018 07 01.
Article in English | MEDLINE | ID: mdl-29982439

ABSTRACT

Background: Genomic profiling efforts have revealed a rich diversity of oncogenic fusion genes. While there are many methods for identifying fusion genes from RNA-sequencing (RNA-seq) data, visualizing these transcripts and their supporting reads remains challenging. Findings: Clinker is a bioinformatics tool written in Python, R, and Bpipe that leverages the superTranscript method to visualize fusion genes. We demonstrate the use of Clinker to obtain interpretable visualizations of the RNA-seq data that lead to fusion calls. In addition, we use Clinker to explore multiple fusion transcripts with novel breakpoints within the P2RY8-CRLF2 fusion gene in B-cell acute lymphoblastic leukemia. Conclusions: Clinker is freely available software that allows visualization of fusion genes and the RNA-seq data used in their discovery.


Subject(s)
Computational Biology/methods , Oncogene Proteins, Fusion/genetics , Sequence Analysis, RNA/methods , Software , Alternative Splicing , Frameshift Mutation , Gene Expression Profiling , Genomics , Humans , Leukemia, B-Cell/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Programming Languages , Protein Domains , Protein Isoforms , RNA/genetics , Receptors, Cytokine/genetics , Receptors, Purinergic P2Y/genetics
15.
Gigascience ; 7(5)2018 05 01.
Article in English | MEDLINE | ID: mdl-29722876

ABSTRACT

Background: RNA sequencing (RNA-seq) analyses can benefit from performing a genome-guided and de novo assembly, in particular for species where the reference genome or the annotation is incomplete. However, tools for integrating an assembled transcriptome with reference annotation are lacking. Findings: Necklace is a software pipeline that runs genome-guided and de novo assembly and combines the resulting transcriptomes with reference genome annotations. Necklace constructs a compact but comprehensive superTranscriptome out of the assembled and reference data. Reads are subsequently aligned and counted in preparation for differential expression testing. Conclusions: Necklace allows a comprehensive transcriptome to be built from a combination of assembled and annotated transcripts, which results in a more comprehensive transcriptome for the majority of organisms. In addition, RNA-seq data are mapped back to this newly created superTranscript reference to enable differential expression testing with standard methods.


Subject(s)
Sequence Analysis, RNA/methods , Software , Transcriptome/genetics , Animals , Databases, Genetic , Molecular Sequence Annotation , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reference Standards , Sheep/genetics
17.
Genome Biol ; 18(1): 148, 2017 Aug 04.
Article in English | MEDLINE | ID: mdl-28778180

ABSTRACT

Numerous methods have been developed to analyse RNA sequencing (RNA-seq) data, but most rely on the availability of a reference genome, making them unsuitable for non-model organisms. Here we present superTranscripts, a substitute for a reference genome, where each gene with multiple transcripts is represented by a single sequence. The Lace software is provided to construct superTranscripts from any set of transcripts, including de novo assemblies. We demonstrate how superTranscripts enable visualisation, variant detection and differential isoform detection in non-model organisms. We further use Lace to combine reference and assembled transcriptomes for chicken and recover hundreds of gaps in the reference genome.

18.
Nat Commun ; 8(1): 132, 2017 07 25.
Article in English | MEDLINE | ID: mdl-28743862

ABSTRACT

The ratites are a distinctive clade of flightless birds, typified by the emu and ostrich that have acquired a range of unique anatomical characteristics since diverging from basal Aves at least 100 million years ago. The emu possesses a vestigial wing with a single digit and greatly reduced forelimb musculature. However, the embryological basis of wing reduction and other anatomical changes associated with loss of flight are unclear. Here we report a previously unknown co-option of the cardiac transcription factor Nkx2.5 to the forelimb in the emu embryo, but not in ostrich, or chicken and zebra finch, which have fully developed wings. Nkx2.5 is expressed in emu limb bud mesenchyme and maturing wing muscle, and mis-expression of Nkx2.5 throughout the limb bud in chick results in wing reductions. We propose that Nkx2.5 functions to inhibit early limb bud expansion and later muscle growth during development of the vestigial emu wing. The transcription factor Nkx2.5 is essential for heart development. Here, the authors identify a previously unknown expression domain for Nkx2.5 in the emu wing and explore its role in diminished wing bud development in the flightless emu, compared with three other birds that have functional wings.


Subject(s)
Avian Proteins/genetics , Homeobox Protein Nkx-2.5/genetics , Transcription Factors/genetics , Wings, Animal/metabolism , Animals , Avian Proteins/metabolism , Dromaiidae , Forelimb/embryology , Forelimb/metabolism , Gene Expression Profiling/methods , Gene Expression Regulation, Developmental , In Situ Hybridization , Limb Buds/embryology , Limb Buds/metabolism , Mesoderm/embryology , Mesoderm/metabolism , Muscle, Skeletal/embryology , Muscle, Skeletal/metabolism , Myocardium/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Wings, Animal/embryology
19.
Evodevo ; 7: 26, 2016.
Article in English | MEDLINE | ID: mdl-28031782

ABSTRACT

BACKGROUND: The forelimb of the flightless emu is a vestigial structure, with greatly reduced wing elements and digit loss. To explore the molecular and cellular mechanisms associated with the evolution of vestigial wings and loss of flight in the emu, key limb patterning genes were examined in developing embryos. METHODS: Limb development was compared in emu versus chicken embryos. Immunostaining for cell proliferation markers was used to analyze growth of the emu forelimb and hindlimb buds. Expression patterns of limb patterning genes were studied, using whole-mount in situ hybridization (for mRNA localization) and RNA-seq (for mRNA expression levels). RESULTS: The forelimb of the emu embryo showed heterochronic development compared to that in the chicken, with the forelimb bud being retarded in its development. Early outgrowth of the emu forelimb bud is characterized by a lower level of cell proliferation compared with the hindlimb bud, as assessed by PH3 immunostaining. In contrast, there were no obvious differences in apoptosis in forelimb versus hindlimb buds (cleaved caspase 3 staining). Most key patterning genes were expressed in emu forelimb buds similarly to that observed in the chicken, but with smaller expression domains. However, expression of Sonic Hedgehog (Shh) mRNA, which is central to anterior-posterior axis development, was delayed in the emu forelimb bud relative to other patterning genes. Regulators of Shh expression, Gli3 and HoxD13, also showed altered expression levels in the emu forelimb bud. CONCLUSIONS: These data reveal heterochronic but otherwise normal expression of most patterning genes in the emu vestigial forelimb. Delayed Shh expression may be related to the small and vestigial structure of the emu forelimb bud. However, the genetic mechanism driving retarded emu wing development is likely to rest within the forelimb field of the lateral plate mesoderm, predating the expression of patterning genes.
