Search | BVS Regional Portal

Merging short and stranded long reads improves transcript assembly.

Kainth, Amoldeep S; Haddad, Gabriela A; Hall, Johnathon M; Ruthenburg, Alexander J.

PLoS Comput Biol ; 19(10): e1011576, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37883581

ABSTRACT

Long-read RNA sequencing has arisen as a counterpart to short-read sequencing, with the potential to capture full-length isoforms, albeit at the cost of lower depth. Yet this potential is not fully realized due to inherent limitations of current long-read assembly methods and underdeveloped approaches to integrate short-read data. Here, we critically compare the existing methods and develop a new integrative approach to characterize a particularly challenging pool of low-abundance long noncoding RNA (lncRNA) transcripts from short- and long-read sequencing in two distinct cell lines. Our analysis reveals severe limitations in each of the sequencing platforms. For short-read assemblies, coverage declines at transcript termini resulting in ambiguous ends, and uneven low coverage results in segmentation of a single transcript into multiple transcripts. Conversely, long-read sequencing libraries lack depth and strand-of-origin information in cDNA-based methods, culminating in erroneous assembly and quantitation of transcripts. We also discover a cDNA synthesis artifact in long-read datasets that markedly impacts the identity and quantitation of assembled transcripts. Towards remediating these problems, we develop a computational pipeline to "strand" long-read cDNA libraries that rectifies inaccurate mapping and assembly of long-read transcripts. Leveraging the strengths of each platform and our computational stranding, we also present and benchmark a hybrid assembly approach that drastically increases the sensitivity and accuracy of full-length transcript assembly on the correct strand and improves detection of biological features of the transcriptome. When applied to a challenging set of under-annotated and cell-type variable lncRNA, our method resolves the segmentation problem of short-read sequencing and the depth problem of long-read sequencing, resulting in the assembly of coherent transcripts with precise 5' and 3' ends. Our workflow can be applied to existing datasets for superior demarcation of transcript ends and refined isoform structure, which can enable better differential gene expression analyses and molecular manipulations of transcripts.

Subjects

Long Noncoding RNA, Complementary DNA/genetics, RNA Sequence Analysis/methods, Transcriptome, Gene Library, Protein Isoforms/genetics, High-Throughput Nucleotide Sequencing/methods
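
The abstract above does not describe how the authors' pipeline "strands" long-read cDNA libraries. As a minimal illustrative sketch only, and not the published method, one widely used cue for the strand of a spliced but unstranded read is the dinucleotide pair at each intron boundary: GT...AG (or the rarer GC...AG and AT...AC) seen on the forward genome sequence suggests a plus-strand transcript, while the reverse-complement pairs suggest minus. The function name `infer_strand` and the 0-based, half-open intron coordinates are assumptions made for this example.

```python
from collections import Counter

# Intron boundary dinucleotides as read on the forward genome sequence.
# Plus-strand transcripts show the motif directly; minus-strand transcripts
# show its reverse complement (e.g. GT...AG becomes CT...AC).
PLUS_MOTIFS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}
MINUS_MOTIFS = {("CT", "AC"), ("CT", "GC"), ("GT", "AT")}


def infer_strand(genome_seq, introns):
    """Vote on a read's strand from its introns.

    genome_seq: forward-strand reference sequence (hypothetical input).
    introns: list of (start, end) pairs, 0-based and half-open.
    Returns "+", "-", or "." when the evidence is absent or tied.
    """
    votes = Counter()
    for start, end in introns:
        donor_side = genome_seq[start:start + 2].upper()
        acceptor_side = genome_seq[end - 2:end].upper()
        if (donor_side, acceptor_side) in PLUS_MOTIFS:
            votes["+"] += 1
        elif (donor_side, acceptor_side) in MINUS_MOTIFS:
            votes["-"] += 1
    if votes["+"] > votes["-"]:
        return "+"
    if votes["-"] > votes["+"]:
        return "-"
    return "."
```

A read with no introns, or with conflicting motifs, is left unassigned here; such reads would need other evidence (for example adapter or poly(A) orientation), which this sketch does not attempt.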

Profiling lariat intermediates reveals genetic determinants of early and late co-transcriptional splicing.

Zeng, Yi; Fair, Benjamin J; Zeng, Huilin; Krishnamohan, Aiswarya; Hou, Yichen; Hall, Johnathon M; Ruthenburg, Alexander J; Li, Yang I; Staley, Jonathan P.

Mol Cell ; 82(24): 4681-4699.e8, 2022 Dec 15.

Article in English | MEDLINE | ID: mdl-36435176

ABSTRACT

Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.

Subjects

Alternative Splicing, RNA Splicing, Humans, Base Sequence, Introns/genetics, Exons/genetics

Chromatin-enriched RNAs mark active and repressive cis-regulation: An analysis of nuclear RNA-seq.

Sun, Xiangying; Wang, Zhezhen; Hall, Johnathon M; Perez-Cervantes, Carlos; Ruthenburg, Alexander J; Moskowitz, Ivan P; Gribskov, Michael; Yang, Xinan H.

PLoS Comput Biol ; 16(2): e1007119, 2020 Feb.

Article in English | MEDLINE | ID: mdl-32040509

ABSTRACT

Long noncoding RNAs (lncRNAs) localize in the cell nucleus and influence gene expression through a variety of molecular mechanisms. Chromatin-enriched RNAs (cheRNAs) are a unique class of lncRNAs that are tightly bound to chromatin and putatively function to locally cis-activate gene transcription. CheRNAs can be identified by biochemical fractionation of nuclear RNA followed by RNA sequencing, but until now, a rigorous analytic pipeline for nuclear RNA-seq has been lacking. In this study, we survey four computational strategies for nuclear RNA-seq data analysis and develop a new pipeline, Tuxedo-ch, which outperforms other approaches. Tuxedo-ch assembles a more complete transcriptome and identifies cheRNA with higher accuracy than other approaches. We used Tuxedo-ch to analyze benchmark datasets of K562 cells and further characterize the genomic features of intergenic cheRNA (icheRNA) and their similarity to enhancer RNAs (eRNAs). We quantify the transcriptional correlation of icheRNA and adjacent genes and show that icheRNA is more positively associated with neighboring gene expression than eRNA or cap analysis of gene expression (CAGE) signals. We also explore two novel genomic associations of cheRNA, which indicate that cheRNAs may function to promote or repress gene expression in a context-dependent manner. IcheRNA loci with significant levels of H3K9me3 modifications are associated with active enhancers, consistent with the hypothesis that enhancers are derived from ancient mobile elements. In contrast, antisense cheRNA (as-cheRNA) may play a role in local gene repression, possibly through local RNA:DNA:DNA triple-helix formation.

Subjects

Cell Nucleus/genetics, Chromatin/metabolism, Gene Expression Regulation, RNA/genetics, RNA Sequence Analysis/methods, Animals, Computational Biology, Genetic Enhancer Elements, Humans, Messenger RNA/genetics
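
The abstract above reports quantifying the transcriptional correlation between icheRNA and adjacent genes but does not state which statistic was used. The sketch below assumes a Pearson correlation on log2-transformed expression values across samples, purely to illustrate what such a comparison could look like; the helper names `log_expr` and `pearson` and the pseudocount of 1 are hypothetical choices for this example.

```python
import math


def log_expr(values, pseudocount=1.0):
    """Log2-transform expression values (pseudocount is an assumed choice)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers.

    Returns NaN when either input has zero variance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def neighbor_correlation(rna_expr, gene_expr):
    """Correlate one noncoding RNA with its adjacent gene across samples."""
    return pearson(log_expr(rna_expr), log_expr(gene_expr))
```

Under these assumptions, computing `neighbor_correlation` for each icheRNA and gene pair, and separately for each eRNA and gene pair, would give two distributions of coefficients that could then be compared.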

Transcription-factor-dependent enhancer transcription defines a gene regulatory network for cardiac rhythm.

Yang, Xinan H; Nadadur, Rangarajan D; Hilvering, Catharina Re; Bianchi, Valerio; Werner, Michael; Mazurek, Stefan R; Gadek, Margaret; Shen, Kaitlyn M; Goldman, Joseph Aaron; Tyan, Leonid; Bekeny, Jenna; Hall, Johnathon M; Lee, Nutishia; Perez-Cervantes, Carlos; Burnicka-Turek, Ozanna; Poss, Kenneth D; Weber, Christopher R; de Laat, Wouter; Ruthenburg, Alexander J; Moskowitz, Ivan P.

Elife ; 6, 2017 Dec 27.

Article in English | MEDLINE | ID: mdl-29280435

ABSTRACT

The noncoding genome is pervasively transcribed. Noncoding RNAs (ncRNAs) generated from enhancers have been proposed as a general facet of enhancer function and some have been shown to be required for enhancer activity. Here we examine the transcription-factor-(TF)-dependence of ncRNA expression to define enhancers and enhancer-associated ncRNAs that are involved in a TF-dependent regulatory network. TBX5, a cardiac TF, regulates a network of cardiac channel genes to maintain cardiac rhythm. We deep sequenced wildtype and Tbx5-mutant mouse atria, identifying ~2600 novel Tbx5-dependent ncRNAs. Tbx5-dependent ncRNAs were enriched for tissue-specific marks of active enhancers genome-wide. Tbx5-dependent ncRNAs emanated from regions that are enriched for TBX5-binding and that demonstrated Tbx5-dependent enhancer activity. Tbx5-dependent ncRNA transcription provided a quantitative metric of Tbx5-dependent enhancer activity, correlating with target gene expression. We identified RACER, a novel Tbx5-dependent long noncoding RNA (lncRNA) required for the expression of the calcium-handling gene Ryr2. We illustrate that TF-dependent enhancer transcription can illuminate components of TF-dependent gene regulatory networks.

Subjects

Genetic Enhancer Elements, Gene Regulatory Networks, Untranslated RNA/biosynthesis, T-Box Domain Proteins/metabolism, Genetic Transcription, Animals, Heart/physiology, Mice, Periodicity
