1.
Cancers (Basel) ; 13(21), 2021 11 03.
Article in English | MEDLINE | ID: mdl-34771686

ABSTRACT

Anaplastic large cell lymphomas associated with ALK translocation have a good outcome after CHOP treatment; however, the 2-year relapse rate remains at 30%. Microarray gene-expression profiling of 48 samples obtained at diagnosis was used to identify 47 genes that were differentially expressed between patients with early relapse/progression and no relapse. In the relapsing group, the most significant overrepresented genes were related to the regulation of the immune response and T-cell activation while those in the non-relapsing group were involved in the extracellular matrix. Fluidigm technology gave concordant results for 29 genes, of which FN1, FAM179A, and SLC40A1 had the strongest predictive power after logistic regression and two classification algorithms. In parallel with 39 samples, we used a Kallisto/Sleuth pipeline to analyze RNA sequencing data and identified 20 genes common to the 28 genes validated by Fluidigm technology, notably the FAM179A and FN1 genes. Interestingly, FN1 also belongs to the gene signature predicting longer survival in diffuse large B-cell lymphomas treated with CHOP. Thus, our molecular signatures indicate that the FN1 gene, a key matrix regulator, might also be involved in the prognosis and the therapeutic response in anaplastic lymphomas.

2.
BMC Genomics ; 22(1): 412, 2021 Jun 04.
Article in English | MEDLINE | ID: mdl-34088266

ABSTRACT

BACKGROUND: The development of RNA sequencing (RNAseq) and the corresponding emergence of public datasets have created new avenues for the search for transcriptional markers. The long non-coding RNAs (lncRNAs) constitute an emerging class of transcripts with a potential for high tissue specificity and function. Therefore, we tested the biomarker potential of lncRNAs on Mesenchymal Stem Cells (MSCs), a complex type of adult multipotent stem cells of diverse tissue origins that is frequently used in clinics but lacks extensive characterization. RESULTS: We developed a dedicated bioinformatics pipeline for the purpose of building a cell-specific catalogue of unannotated lncRNAs. The pipeline performs ab initio transcript identification and pseudoalignment, and uses new methodologies such as a specific k-mer approach for naive quantification of expression across numerous RNAseq datasets. We next applied it to MSCs, and our pipeline was able to highlight novel lncRNAs with high cell specificity. Furthermore, with original and efficient approaches for functional prediction, we demonstrated that each candidate represents one specific state of MSC biology. CONCLUSIONS: We showed that our approach can be employed to harness lncRNAs as cell markers. More specifically, our results suggest different candidates as potential actors in MSC biology and propose promising directions for future experimental investigations.


Subjects
Mesenchymal Stem Cells , RNA, Long Noncoding , Base Sequence , Computational Biology , RNA, Long Noncoding/genetics , Sequence Analysis, RNA
3.
NAR Genom Bioinform ; 3(3): lqab058, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34179780

ABSTRACT

The huge body of publicly available RNA-sequencing (RNA-seq) libraries is a treasure trove of functional information, allowing the expression of known or novel transcripts to be quantified in tissues. However, transcript quantification commonly relies on alignment methods that require substantial computational resources and processing time and do not scale easily to large datasets. K-mer decomposition constitutes a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to accurately quantify gene expression in a less resource-consuming way. We present the Kmerator Suite, a set of three tools designed to extract specific k-mer signatures, quantify these k-mers in RNA-seq datasets and quickly visualize large dataset characteristics. The core tool, Kmerator, produces specific k-mers for 97% of human genes, enabling the measurement of gene expression with high accuracy in simulated datasets. KmerExploR, a direct application of Kmerator, uses a set of predictor gene-specific k-mers to infer metadata including library protocol, sample features or contaminations from RNA-seq datasets. KmerExploR results are visualized through a user-friendly interface. Moreover, we demonstrate that the Kmerator Suite can be used for advanced queries targeting known or new biomarkers such as mutations, gene fusions or long non-coding RNAs for human health applications.
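
The idea of a "specific k-mer" can be illustrated with a minimal sketch. This is not the Kmerator implementation: it simply assumes that a k-mer counts as specific to a target sequence when it occurs in none of the background sequences, it ignores reverse complements, and the function names and toy sequences are invented for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def specific_kmers(target, background, k):
    """Return the k-mers of `target` that occur in no `background` sequence."""
    seen_elsewhere = set()
    for seq in background:
        seen_elsewhere |= kmers(seq, k)
    return kmers(target, k) - seen_elsewhere


# Toy sequences, invented for illustration only.
target = "ACGTACGGA"
background = ["TTACGTAC", "GGGACGG"]
print(sorted(specific_kmers(target, background, k=4)))
```

Counting how often such target-only k-mers appear in the reads of a library would then give an alignment-free estimate of the target's expression.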

4.
Stem Cell Reports ; 14(1): 1-8, 2020 01 14.
Article in English | MEDLINE | ID: mdl-31902703

ABSTRACT

Genomic integrity of human pluripotent stem cells (hPSCs) is essential for research and clinical applications. However, genetic abnormalities can accumulate during hPSC generation and routine culture and following gene editing. Their occurrence should be regularly monitored, but the current assays to assess hPSC genomic integrity are not fully suitable for such regular screening. To address this issue, we first carried out a large meta-analysis of all hPSC genetic abnormalities reported in more than 100 publications and identified 738 recurrent genetic abnormalities (i.e., overlapping abnormalities found in at least five distinct scientific publications). We then developed a test based on the droplet digital PCR technology that can potentially detect more than 90% of these hPSC recurrent genetic abnormalities in DNA extracted from culture supernatant samples. This test can be used to routinely screen genomic integrity in hPSCs.


Subjects
Genetic Variation , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Biomarkers , Cell Culture Techniques , Cell Differentiation/genetics , Culture Media, Conditioned , Gene Editing , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Immunophenotyping , Real-Time Polymerase Chain Reaction
5.
Methods Mol Biol ; 1769: 133-156, 2018.
Article in English | MEDLINE | ID: mdl-29564822

ABSTRACT

The RNA-Seq approach enables the detection and characterization of fusion or chimeric transcripts associated with complex genome rearrangements. Until now, these events have classically been identified at the DNA level. Here we describe a complete procedure including a novel way of analyzing reads that combines genomic locations and local coverage to directly infer chimeric junctions with high sensitivity and specificity, allowing identification of different classes of chimeric RNA events. We also recommend best practices for the bioinformatics analysis and describe the experimental process for RNA validation using real-time PCR and sequencing.


Subjects
Chromothripsis , Gene Rearrangement , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA , Transcription, Genetic , Algorithms , Computational Biology/methods , Gene Library , Molecular Sequence Annotation , Workflow
6.
Sci Rep ; 8(1): 2202, 2018 02 02.
Article in English | MEDLINE | ID: mdl-29396444

ABSTRACT

Progress in assisted reproductive technologies strongly relies on understanding the regulation of the dialogue between oocyte and cumulus cells (CCs). Little is known about the role of long non-coding RNAs (lncRNAs) in the human cumulus-oocyte complex (COC). To this aim, publicly available RNA-sequencing data were analyzed to identify lncRNAs that were abundant in metaphase II (MII) oocytes (BCAR4, C3orf56, TUNAR, OOEP-AS1, CASC18, and LINC01118) and CCs (NEAT1, MALAT1, ANXA2P2, MEG3, IL6STP1, and VIM-AS1). These data were validated by RT-qPCR analysis using independent oocytes and CC samples. The functions of the identified lncRNAs were then predicted by constructing lncRNA-mRNA co-expression networks. This analysis suggested that MII oocyte lncRNAs could be involved in chromatin remodeling, cell pluripotency and in driving early embryonic development. CC lncRNAs were co-expressed with genes involved in apoptosis and extracellular matrix-related functions. A bioinformatic analysis of RNA-sequencing data to identify CC lncRNAs that are affected by maternal age showed that lncRNAs with age-related altered expression in CCs are essential for oocyte growth. This comprehensive analysis of lncRNAs expressed in human MII oocytes and CCs could provide biomarkers of oocyte quality for the development of non-invasive tests to identify embryos with high developmental potential.


Subjects
Cumulus Cells/physiology , Gene Expression Profiling , Oocytes/physiology , RNA, Long Noncoding/analysis , Computational Biology , Humans , Metaphase , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction
7.
Hepatology ; 68(1): 89-102, 2018 07.
Article in English | MEDLINE | ID: mdl-29152775

ABSTRACT

Surgery and cisplatin-based treatment of hepatoblastoma (HB) currently guarantee the survival of 70%-80% of patients. However, some important challenges remain in diagnosing high-risk tumors and identifying relevant targetable pathways offering new therapeutic avenues. Previously, two molecular subclasses of HB tumors have been described, C1 and C2, with C2 being the subgroup with the poorest prognosis, a more advanced tumor stage, and the worst overall survival rate. An associated 16-gene signature to discriminate the two tumoral subgroups was proposed, but it has not been transferred into clinical routine. To address these issues, we performed RNA sequencing of 25 tumors and matched normal liver samples from patients. The transcript profiling separated HB into three distinct subgroups named C1, C2A, and C2B, identifiable by a concise four-gene signature: hydroxysteroid 17-beta dehydrogenase 6, integrin alpha 6, topoisomerase 2-alpha, and vimentin, with topoisomerase 2-alpha being characteristic for the proliferative C2A tumors. Differential expression of these genes was confirmed by quantitative RT-PCR on an expanded cohort and by immunohistochemistry. We also revealed significant overexpression of genes involved in the Fanconi anemia (FA) pathway in the C2A subgroup. We then investigated the ability of several described FA inhibitors to block growth of HB cells in vitro and in vivo. We demonstrated that bortezomib, a Food and Drug Administration-approved proteasome inhibitor, strongly impairs the proliferation and survival of HB cell lines in vitro, blocks FA pathway-associated double-strand DNA repair, and significantly impedes HB growth in vivo. CONCLUSION: The highly proliferating C2A subtype is characterized by topoisomerase 2-alpha gene up-regulation and FA pathway activation, and the HB therapeutic arsenal could include bortezomib for the treatment of patients with the most aggressive tumors. (Hepatology 2018;68:89-102).


Subjects
DNA Topoisomerases, Type II/metabolism , Hepatoblastoma/classification , Hepatoblastoma/genetics , Liver Neoplasms/classification , Liver Neoplasms/genetics , Poly-ADP-Ribose Binding Proteins/metabolism , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Biomarkers/metabolism , Bortezomib/pharmacology , Bortezomib/therapeutic use , DNA Repair/drug effects , Fanconi Anemia Complementation Group Proteins/metabolism , Gene Expression Profiling , Hep G2 Cells , Hepatoblastoma/drug therapy , Hepatoblastoma/enzymology , Humans , Liver Neoplasms/drug therapy , Liver Neoplasms/enzymology , Sequence Analysis, RNA
8.
Genome Biol ; 18(1): 243, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29284518

ABSTRACT

We introduce a k-mer-based computational protocol, DE-kupl, for capturing local RNA variation in a set of RNA-seq libraries, independently of a reference genome or transcriptome. DE-kupl extracts all k-mers with differential abundance directly from the raw data files. This enables the retrieval of virtually all variation present in an RNA-seq data set. This variation is subsequently assigned to biological events or entities such as differential long non-coding RNAs, splice and polyadenylation variants, introns, repeats, editing or mutation events, and exogenous RNA. Applying DE-kupl to human RNA-seq data sets identified multiple types of novel events, reproducibly across independent RNA-seq experiments.


Subjects
Computational Biology/methods , Genetic Variation , RNA/genetics , Software , Alleles , Gene Expression Profiling , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans , Polyadenylation , RNA Splicing , RNA, Antisense , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Reproducibility of Results , Sequence Analysis, RNA , Transcriptome
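
The notion of extracting k-mers with differential abundance directly from raw reads, as described in the DE-kupl abstract above, can be sketched as follows. This is only a toy illustration and not the DE-kupl protocol: it pools reads per condition, uses a pseudocount and a log2 fold-change threshold in place of proper normalization and statistical testing, and all names, parameters and reads are assumptions made for the example.

```python
from collections import Counter
from math import log2


def count_kmers(reads, k):
    """Count every k-mer occurrence across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def differential_kmers(group_a, group_b, k, min_abs_log2fc=1.0, pseudocount=1.0):
    """Return {k-mer: log2 fold change (B over A)} for k-mers passing the threshold."""
    a = count_kmers(group_a, k)
    b = count_kmers(group_b, k)
    result = {}
    for kmer in a.keys() | b.keys():
        # Counter returns 0 for k-mers absent from one condition.
        lfc = log2((b[kmer] + pseudocount) / (a[kmer] + pseudocount))
        if abs(lfc) >= min_abs_log2fc:
            result[kmer] = lfc
    return result


# Toy reads, invented for illustration only.
condition_a = ["ACGTA", "ACGTA"]
condition_b = ["ACGTT", "ACGTT", "ACGTT"]
print(differential_kmers(condition_a, condition_b, k=4))
```

Because no reference genome or transcriptome enters the computation, any sequence difference between the two conditions (a variant, a new junction, an unannotated transcript) can surface as a differential k-mer; assigning it to a biological event is a separate downstream step.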
9.
BMC Bioinformatics ; 18(1): 428, 2017 Sep 29.
Article in English | MEDLINE | ID: mdl-28969586

ABSTRACT

BACKGROUND: The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured. In order to provide some answers to these questions, we investigate the performance of leading bioinformatic tools designed for RNA-Seq analysis and propose a methodology for systematic evaluation and comparison of performance to help users make well-informed choices. RESULTS: To evaluate RNA-Seq pipelines, we developed a suite of two benchmarking tools. SimCT generates simulated datasets that get as close as possible to specific real biological conditions, accompanied by the list of genomic incidents and mutations that have been inserted. BenchCT then compares the output of any bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains to give an accurate performance evaluation in addressing a specific biological question. We used these tools to simulate a real-world genomic medicine question involving the comparison of healthy and cancerous cells. Results revealed that performance in addressing a particular biological context varied significantly depending on the choice of tools and settings used. We also found that by combining the output of certain pipelines, substantial performance improvements could be achieved. CONCLUSION: Our research emphasizes the importance of selecting and configuring bioinformatic tools for the specific biological question being investigated to obtain optimal results. Pipeline designers, developers and users should include benchmarking in the context of their biological question as part of their design and quality control process. 
Our SimBA suite of benchmarking tools provides a reliable basis for comparing the performance of RNA-Seq bioinformatics pipelines in addressing a specific biological question. We would like to see the creation of a reference corpus of data-sets that would allow accurate comparison between benchmarks performed by different groups and the publication of more benchmarks based on this public corpus. SimBA software and data-set are available at http://cractools.gforge.inria.fr/softwares/simba/ .


Subjects
Computational Biology/methods , Computer Simulation , Sequence Analysis, RNA/methods , Software , Gene Fusion , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation/genetics , Polymorphism, Single Nucleotide/genetics
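
The benchmarking principle described in the abstract above, comparing a pipeline's calls with the events that were inserted into a simulated dataset, can be sketched as below. This is not BenchCT: events are assumed to be exact-match tuples, and the event format, function name and toy calls are hypothetical.

```python
def benchmark(predicted, truth):
    """Score predicted events against the simulated truth set."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # events found and really inserted
    fp = len(predicted - truth)   # events called but never inserted
    fn = len(truth - predicted)   # inserted events that were missed
    return {
        "tp": tp, "fp": fp, "fn": fn,
        "precision": tp / (tp + fp) if predicted else 0.0,
        "sensitivity": tp / (tp + fn) if truth else 0.0,
    }


# Toy variant calls as (chromosome, position, ref, alt); invented for illustration.
truth = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"),
         ("chr3", 300, "G", "A"), ("chr4", 400, "T", "C")}
pipeline_a = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr9", 900, "A", "T")}
pipeline_b = {("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}

for name, calls in [("A", pipeline_a), ("B", pipeline_b), ("A or B", pipeline_a | pipeline_b)]:
    print(name, benchmark(calls, truth))
```

In this toy case the union of the two call sets recovers more of the inserted events than either pipeline alone, while carrying over pipeline A's false positive; whether such a combination helps in practice depends on the pipelines and the biological question.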
10.
F1000Res ; 6, 2017.
Article in English | MEDLINE | ID: mdl-29623188

ABSTRACT

Background: High-throughput next generation sequencing (NGS) technologies enable the detection of biomarkers used for tumor classification, disease monitoring and cancer therapy. Whole-transcriptome analysis using RNA-seq is important, not only as a means of understanding the mechanisms responsible for complex diseases but also to efficiently identify novel genes/exons, splice isoforms, RNA editing, allele-specific mutations, differential gene expression and fusion transcripts or chimeric RNA (chRNA). Methods: We used Crac, a tool that uses genomic locations and local coverage to classify biological events and directly infer splice and chimeric junctions within a single read. Crac's algorithm extracts transcriptional chimeric events irrespective of annotation with a high sensitivity, and CracTools was used to aggregate, annotate and filter the chRNA reads. The selected chRNA candidates were validated by real-time PCR and sequencing. In order to check the tumor-specific expression of chRNA, we analyzed a publicly available dataset using a new tag search approach. Results: We present data related to acute myeloid leukemia (AML) RNA-seq analysis. We highlight novel biological cases of chRNA, in addition to previously well-characterized leukemia chRNA. We have identified and validated 17 chRNAs among 3 AML patients: 10 from an AML patient with a translocation between chromosomes 15 and 17 (AML-t(15;17)), 4 from a patient with normal karyotype (AML-NK), and 3 from a patient with chromosome 16 inversion (AML-inv16). The new fusion transcripts can be classified into four groups according to the exon organization. Conclusions: All groups suggest complex but distinct synthesis mechanisms involving either collinear exons of different genes, non-collinear exons, or exons of different chromosomes. Finally, we check tumor-specific expression in a larger RNA-seq AML cohort and identify new AML biomarkers that could improve diagnosis and prognosis of AML.

11.
BioData Min ; 9: 34, 2016.
Article in English | MEDLINE | ID: mdl-27822312

ABSTRACT

BACKGROUND: High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed specifically in disease can be used as potential biomarkers in both diagnosis and prognosis. RESULTS: The task of discriminating true chRNAs from false ones poses an interesting Machine Learning (ML) challenge. First of all, the sequencing data may contain false reads due to technical artifacts, and during the analysis process bioinformatics tools may generate false positives due to methodological biases. Moreover, even if we succeed in obtaining a proper set of observations (enough sequencing data) about true chRNAs, chances are that the devised model will not be able to generalize beyond it. As with any other machine learning problem, the first big issue is finding good data to build models. To our knowledge, there is no common benchmark data available for chRNA detection. The definition of a classification baseline is lacking in the related literature too. In this work we move towards benchmark data and an evaluation of the fidelity of supervised classifiers in the prediction of chRNAs. CONCLUSIONS: We proposed a modelling strategy that can be used to increase tool performance in the context of chRNA classification, based on a simulated data generator that makes it possible to continuously integrate new complex chimeric events. The pipeline incorporated a genome mutation process and simulated RNA-seq data. The reads at distinct depths were aligned and analysed by CRAC, which integrates genomic location and local coverage, allowing biological predictions at the read scale. Additionally, these reads were functionally annotated and aggregated to form chRNA events, making it possible to evaluate the performance of ML methods (classifiers) at both the read and event levels. 
Ensemble learning strategies proved to be more robust for this classification problem, providing an average AUC performance of 95 % (ACC=94 %, Kappa=0.87). The resulting classification models were also tested on real RNA-seq data from a set of twenty-seven patients with acute myeloid leukemia (AML).
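
For readers unfamiliar with the metrics reported above, the following minimal sketch shows how accuracy and Cohen's kappa relate for a binary true/false chRNA classifier. Kappa is a unitless agreement coefficient (at most 1) that corrects accuracy for agreement expected by chance. The function name and the toy labels are assumptions; they are not the study's data or code.

```python
def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length lists of labels."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Observed agreement is the plain accuracy.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum((y_true.count(lab) / n) * (y_pred.count(lab) / n) for lab in labels)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined when only one label occurs
    return observed, (observed - expected) / (1.0 - expected)


# Toy labels: 1 = true chimera, 0 = artifact (invented for illustration).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```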

12.
Hum Reprod Update ; 23(1): 19-40, 2016 12.
Article in English | MEDLINE | ID: mdl-27655590

ABSTRACT

BACKGROUND: Human long non-coding RNAs (lncRNAs) are an emerging category of transcripts with increasingly documented functional roles during development. The roles of lncRNAs during human early embryo development have recently begun to be unravelled. OBJECTIVE AND RATIONALE: This review summarizes the most recent knowledge on lncRNAs and focuses on their expression patterns and role during early human embryo development and in pluripotent stem cells (PSCs). Public mRNA sequencing (mRNA-seq) data were used to illustrate these expression signatures. SEARCH METHODS: The PubMed and EMBASE databases were first interrogated using specific terms, such as 'lncRNAs', to get an extensive overview on lncRNAs up to February 2016, and then using 'human lncRNAs' and 'embryo', 'development', or 'PSCs' to focus on lncRNAs involved in human embryo development or in PSC. Recently published RNA-seq data from human oocytes and pre-implantation embryos (including single-cell data), PSC and a panel of normal and malignant adult tissues were used to describe the specific expression patterns of some lncRNAs in early human embryos. OUTCOMES: The existence and the crucial role of lncRNAs in many important biological phenomena in each branch of the life tree are now well documented. The number of identified lncRNAs is rapidly increasing and has already outnumbered that of protein-coding genes. Unlike small non-coding RNAs, a variety of mechanisms of action have been proposed for lncRNAs. The functional role of lncRNAs has been demonstrated in many biological and developmental processes, including cell pluripotency induction, X-inactivation or gene imprinting. Analysis of RNA-seq data highlights that lncRNA abundance changes significantly during human early embryonic development. This suggests that lncRNAs could represent candidate biomarkers for developing non-invasive tests for oocyte or embryo quality. 
Finally, some of these lncRNAs are also expressed in human cancer tissues, suggesting that reactivation of an embryonic lncRNA program may contribute to human malignancies. WIDER IMPLICATIONS: LncRNAs are emerging potential key players in gene expression regulation. Analysis of RNA-seq data from human pre-implantation embryos identified lncRNA signatures that are specific to this critical step. We anticipate that further studies will show that these new transcripts are major regulators of embryo development. These findings might also be used to develop new tests/treatments for improving the pregnancy success rate in IVF procedures or for regenerative medicine applications involving PSC.


Subjects
Embryonic Development/genetics , Gene Expression Regulation , RNA, Long Noncoding/metabolism , Humans , Neoplasms/genetics , X Chromosome Inactivation
13.
Nat Commun ; 7: 10767, 2016 Feb 24.
Article in English | MEDLINE | ID: mdl-26908133

ABSTRACT

The cytidine analogues azacytidine and 5-aza-2'-deoxycytidine (decitabine) are commonly used to treat myelodysplastic syndromes, with or without a myeloproliferative component. It remains unclear whether the response to these hypomethylating agents results from a cytotoxic or an epigenetic effect. In this study, we address this question in chronic myelomonocytic leukaemia. We describe a comprehensive analysis of the mutational landscape of these tumours, combining whole-exome and whole-genome sequencing. We identify an average of 14±5 somatic mutations in coding sequences of sorted monocyte DNA and the signatures of three mutational processes. Serial sequencing demonstrates that the response to hypomethylating agents is associated with changes in DNA methylation and gene expression, without any decrease in the mutation allele burden or prevention of the occurrence of new genetic alterations. Our findings indicate that cytosine analogues restore a balanced haematopoiesis without decreasing the size of the mutated clone, arguing for a predominantly epigenetic effect.


Subjects
Antimetabolites, Antineoplastic/pharmacology , Azacitidine/analogs & derivatives , Azacitidine/pharmacology , Cell Survival/drug effects , DNA Methylation/drug effects , Epigenesis, Genetic/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Leukemia, Myelomonocytic, Chronic/genetics , Mutation , Aged , Aged, 80 and over , Alleles , Antimetabolites, Antineoplastic/therapeutic use , Azacitidine/therapeutic use , Decitabine , Female , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Leukemia, Myelomonocytic, Chronic/drug therapy , Male , Middle Aged , Sequence Analysis, DNA , Sequence Analysis, RNA
14.
Biomed Res Int ; 2014: 423174, 2014.
Article in English | MEDLINE | ID: mdl-24883311

ABSTRACT

Despite the improvement in treatment options, chronic lymphocytic leukemia (CLL) remains an incurable disease and patients show a heterogeneous clinical course, with many of them requiring therapy. In the current work, we used microarray gene expression data to build a 20-gene expression (GE)-based risk score that is predictive of patients' overall survival and improves risk classification. The GE-based risk score identified a high-risk group associated with significantly shorter overall survival (OS) and time to treatment (TTT) (P ≤ .01), comprising 19.6% and 13.6% of the patients in two independent cohorts. The GE-based risk score and NRIP1 and TCF7 gene expression remained independent prognostic factors in multivariate Cox analyses, and combining the GE-based risk score with NRIP1 and TCF7 gene expression enabled the identification of three clinically distinct groups of CLL patients. Therefore, this GE-based risk score represents a powerful tool for risk stratification and outcome prediction of CLL patients and could thus be used to guide clinical and therapeutic decisions prospectively.


Subjects
Gene Expression Regulation, Neoplastic , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Neoplasm Proteins/biosynthesis , Prognosis , Adaptor Proteins, Signal Transducing/biosynthesis , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Microarray Analysis , Nuclear Proteins/biosynthesis , Nuclear Receptor Interacting Protein 1 , Survival Analysis , T Cell Transcription Factor 1/biosynthesis , Treatment Outcome
15.
Nucleic Acids Res ; 42(5): 2820-32, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24357408

ABSTRACT

Recent sequencing technologies that allow massive parallel production of short reads are the method of choice for transcriptome analysis. Particularly, digital gene expression (DGE) technologies produce a large dynamic range of expression data by generating short tag signatures for each cell transcript. These tags can be mapped back to a reference genome to identify new transcribed regions that can be further covered by RNA-sequencing (RNA-Seq) reads. Here, we applied an integrated bioinformatics approach that combines DGE tags, RNA-Seq, tiling array expression data and species-comparison to explore new transcriptional regions and their specific biological features, particularly tissue expression or conservation. We analysed tags from a large DGE data set (designated as 'TranscriRef'). We then annotated 750,000 tags that were uniquely mapped to the human genome according to Ensembl. We retained transcripts originating from both DNA strands and categorized tags corresponding to protein-coding genes, antisense, intronic- or intergenic-transcribed regions and computed their overlap with annotated non-coding transcripts. Using this bioinformatics approach, we identified ∼34,000 novel transcribed regions located outside the boundaries of known protein-coding genes. As demonstrated using sequencing data from human pluripotent stem cells for biological validation, the method could be easily applied for the selection of tissue-specific candidate transcripts. DigitagCT is available at http://cractools.gforge.inria.fr/softwares/digitagct.


Subjects
Gene Expression Profiling/methods , Genome, Human , RNA, Untranslated/analysis , Sequence Analysis, RNA/methods , Cell Line , Humans , Molecular Sequence Annotation , Poly A/analysis , Software , Transcription, Genetic
16.
Genome Biol ; 14(3): R30, 2013 Mar 28.
Article in English | MEDLINE | ID: mdl-23537109

ABSTRACT

A large number of RNA-sequencing studies set out to predict mutations, splice junctions or fusion RNAs. We propose a method, CRAC, that integrates genomic locations and local coverage to enable such predictions to be made directly from RNA-seq read analysis. A k-mer profiling approach detects candidate mutations, indels and splice or chimeric junctions in each single read. CRAC increases precision compared with existing tools, reaching 99.5% for splice junctions, without losing sensitivity. Importantly, CRAC predictions improve with read length. In cancer libraries, CRAC recovered 74% of validated fusion RNAs and predicted novel recurrent chimeric junctions. CRAC is available at http://crac.gforge.inria.fr.


Subjects
Algorithms , Sequence Analysis, RNA/methods , Breast Neoplasms/genetics , Computer Simulation , Female , Gene Library , Genome , Humans , RNA Splice Sites/genetics
17.
Cancer Biol Ther ; 14(5): 401-10, 2013 May.
Article in English | MEDLINE | ID: mdl-23377825

ABSTRACT

The N-myc downstream regulated gene 1 (NDRG1) has been identified as a metastasis-suppressor gene in prostate cancer (PCa). Compounds targeting PCa cells deficient in NDRG1 could potentially decrease invasion/metastasis of PCa. A cell based screening strategy was employed to identify small molecules that selectively target NDRG1 deficient PCa cells. DU-145 PCa cells rendered deficient in NDRG1 expression by a lentiviral shRNA-mediated knockdown strategy were used in the primary screen. Compounds filtered from the primary screen were further validated through proliferation and clonogenic survival assays in parental and NDRG1 knockdown PCa cells. Screening of 3360 compounds revealed irinotecan and cetrimonium bromide (CTAB) as compounds that exhibited synthetic lethality against NDRG1 deficient PCa cells. A three-dimensional (3-D) invasion assay was utilized to test the ability of CTAB to inhibit invasion of DU-145 cells. CTAB was found to remarkably decrease invasion of DU-145 cells in collagen matrix. Our results suggest that CTAB and irinotecan could be further explored for their potential clinical benefit in patients with NDRG1 deficient PCa.


Subjects
Camptothecin/analogs & derivatives , Cell Cycle Proteins/deficiency , Cetrimonium Compounds/pharmacology , Intracellular Signaling Peptides and Proteins/deficiency , Prostatic Neoplasms/drug therapy , Prostatic Neoplasms/metabolism , Antineoplastic Agents, Phytogenic/pharmacology , Camptothecin/pharmacology , Cell Cycle Proteins/genetics , Cell Cycle Proteins/metabolism , Cell Growth Processes/drug effects , Cell Line, Tumor , Cetrimonium , Gene Knockdown Techniques , Humans , Intracellular Signaling Peptides and Proteins/genetics , Intracellular Signaling Peptides and Proteins/metabolism , Irinotecan , Male , Middle Aged , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , RNA, Small Interfering/administration & dosage , RNA, Small Interfering/genetics , Surface-Active Agents/pharmacology
18.
Oncotarget ; 3(8): 824-32, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22910040

ABSTRACT

Patients with normal karyotype represent the single largest cytogenetic group of acute myeloid leukemia (AML), with highly heterogeneous clinical and molecular characteristics. In this study, we sought to determine new prognostic biomarkers in cytogenetically normal (CN)-AML patients. A gene expression (GE)-based risk score was built by summing the prognostic value of 22 genes whose expression is associated with a bad prognosis in a training cohort of 163 patients. The GE-based risk score identified a high-risk group of patients (53.4%) in two independent cohorts of CN-AML patients. The GE-based risk score and EVI1 gene expression remained independent prognostic factors in multivariate Cox analyses. Combining the GE-based risk score with EVI1 gene expression allowed the identification of three clinically different groups of patients in two independent cohorts of CN-AML patients. Thus, the GE-based risk score is a powerful predictor of clinical outcome for CN-AML patients and may provide potential therapeutic advances.


Subjects
Biomarkers, Tumor/genetics , DNA-Binding Proteins/genetics , Gene Expression , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/genetics , Proto-Oncogenes/genetics , Transcription Factors/genetics , Adult , Cytogenetic Analysis , DNA-Binding Proteins/biosynthesis , Disease-Free Survival , Gene Expression Profiling , Humans , Karyotype , MDS1 and EVI1 Complex Locus Protein , Neoplasm Proteins/biosynthesis , Neoplasm Proteins/genetics , Prognosis , Risk , Trans-Activators/biosynthesis , Trans-Activators/genetics , Transcription Factors/biosynthesis , Transcriptional Regulator ERG , Tumor Suppressor Proteins/biosynthesis , Tumor Suppressor Proteins/genetics
19.
Br J Haematol ; 157(3): 347-56, 2012 May.
Article in English | MEDLINE | ID: mdl-22390678

ABSTRACT

Chronic myelomonocytic leukaemia (CMML) is a heterogeneous haematopoietic disorder characterized by myeloproliferative or myelodysplastic features. At present, the pathogenesis of this malignancy is not completely understood. In this study, we sought to analyse gene expression profiles of CMML in order to characterize new molecular outcome predictors. A learning set of 32 untreated CMML patients at diagnosis was available for TaqMan low-density array gene expression analysis. From 93 selected genes related to cancer and cell cycle, we built a five-gene prognostic index after multiplicity correction. Using this index, we characterized two categories of patients with distinct overall survival (94% vs. 19% for good and poor overall survival, respectively; P = 0·007) and we successfully validated its strength on an independent cohort of 21 CMML patients with Affymetrix gene expression data. We found no specific patterns of association with traditional prognostic stratification parameters in the learning cohort. However, the poor survival group strongly correlated with high-risk treated patients and transformation to acute myeloid leukaemia. We report here a new multigene prognostic index for CMML, independent of the gene expression measurement method, which could be used as a powerful tool to predict clinical outcome and help physicians to evaluate criteria for treatments.


Subjects
Biomarkers, Tumor/metabolism , Leukemia, Myelomonocytic, Chronic/diagnosis , Aged , Aged, 80 and over , Case-Control Studies , Female , Follow-Up Studies , Gene Expression Profiling/methods , Humans , Kaplan-Meier Estimate , Leukemia, Myelomonocytic, Chronic/therapy , Male , Middle Aged , Multigene Family , Oligonucleotide Array Sequence Analysis/methods , Polymerase Chain Reaction/methods , Prognosis , RNA, Neoplasm/genetics , Treatment Outcome , U937 Cells
20.
BMC Bioinformatics ; 12: 242, 2011 Jun 17.
Article in English | MEDLINE | ID: mdl-21682852

ABSTRACT

BACKGROUND: High-throughput sequencing (HTS) is now heavily exploited for genome (re-)sequencing, metagenomics, epigenomics, and transcriptomics, and requires different but compute-intensive bioinformatic analyses. When a reference genome is available, mapping reads to it is the first step of the analysis. Read mapping programs owe their efficiency to sophisticated genome indexing data structures, such as the Burrows-Wheeler transform. Recent solutions index both the genome and the k-mers of the reads using hash tables to further increase efficiency and accuracy. In various contexts (e.g. assembly or transcriptome analysis), read processing requires determining the sub-collection of reads that are related to a given sequence, which is done by searching for some k-mers in the reads. Many developments have focused on genome indexing structures for read mapping, but the question of read indexing remains broadly unexplored. However, the increase in sequencing throughput calls for new algorithmic solutions to query large read collections efficiently. RESULTS: Here, we present a solution, named Gk arrays, to index large collections of reads, an algorithm to build the structure, and procedures to query it. Once constructed, the index structure is kept in main memory and is repeatedly accessed to answer queries like "given a k-mer, get the reads containing this k-mer (once/at least once)". We compared our structure to other solutions that adapt uncompressed indexing structures designed for long texts and show that it processes queries fast while requiring much less memory. Our structure can thus handle larger read collections. We provide examples where such queries are adapted to different types of read analysis (SNP detection, assembly, RNA-Seq). CONCLUSIONS: Gk arrays constitute a versatile data structure that enables fast and more accurate read analysis in various contexts. The Gk arrays provide a flexible building block for designing innovative programs that efficiently mine genomics, epigenomics, metagenomics, or transcriptomics reads. The Gk arrays library is available under the CeCILL (GPL-compliant) license from http://www.atgc-montpellier.fr/ngs/.
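
The kind of query quoted in the abstract ("given a k-mer, get the reads containing this k-mer (once/at least once)") can be illustrated with a deliberately naive sketch. This is not the Gk arrays structure or its library API: it is a plain hash-table index of the sort the abstract contrasts with, and the function names (`build_kmer_index`, `reads_containing`, `reads_containing_once`) are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def build_kmer_index(reads, k):
    """Naive hash-table index (NOT Gk arrays): k-mer -> {read id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for read_id, read in enumerate(reads):
        # Slide a window of length k over the read and record each occurrence.
        for pos in range(len(read) - k + 1):
            index[read[pos:pos + k]][read_id].append(pos)
    return index


def reads_containing(index, kmer):
    """Ids of reads that contain the k-mer at least once."""
    return sorted(index.get(kmer, {}))


def reads_containing_once(index, kmer):
    """Ids of reads that contain the k-mer exactly once."""
    hits = index.get(kmer, {})
    return sorted(rid for rid, positions in hits.items() if len(positions) == 1)


if __name__ == "__main__":
    reads = ["ACGTACGT", "TTACGA", "GGGG"]  # toy reads, made up for illustration
    index = build_kmer_index(reads, k=3)
    print(reads_containing(index, "ACG"))
    print(reads_containing_once(index, "ACG"))
```

A dictionary like this stores every k-mer string explicitly, so its memory footprint grows quickly with the number of reads; reducing that footprint while keeping queries fast is the problem the abstract says Gk arrays address.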


Subjects
Algorithms , Computational Biology/methods , Computers , High-Throughput Nucleotide Sequencing , Humans , Software
...