1.
Nat Commun ; 15(1): 4110, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38750024

ABSTRACT

Maturation of eukaryotic pre-mRNAs via splicing and polyadenylation is modulated across cell types and conditions by a variety of RNA-binding proteins (RBPs). Although there exist over 1,500 RBPs in human cells, their binding motifs and functions still remain to be elucidated, especially in the complex environment of tissues and in the context of diseases. To overcome the lack of methods for the systematic and automated detection of sequence motif-guided pre-mRNA processing regulation from RNA sequencing (RNA-Seq) data, we have developed MAPP (Motif Activity on Pre-mRNA Processing). Applying MAPP to RBP knock-down experiments reveals that many RBPs regulate both splicing and polyadenylation of nascent transcripts by acting on similar sequence motifs. MAPP not only infers these sequence motifs, but also unravels the position-dependent impact of the RBPs on pre-mRNA processing. Interestingly, all investigated RBPs that act on both splicing and 3' end processing exhibit a consistently repressive or activating effect on both processes, providing a first glimpse into the underlying mechanism. Applying MAPP to normal and malignant brain tissue samples unveils that the motifs bound by the PTBP1 and RBFOX RBPs coordinately drive the oncogenic splicing program active in glioblastomas, demonstrating that MAPP paves the way for characterizing pre-mRNA processing regulators under physiological and pathological conditions.


Subject(s)
Polyadenylation , RNA Precursors , RNA Splicing , RNA-Binding Proteins , Humans , RNA-Binding Proteins/metabolism , RNA-Binding Proteins/genetics , RNA Precursors/metabolism , RNA Precursors/genetics , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , Neoplasms/metabolism , Nucleotide Motifs , Polypyrimidine Tract-Binding Protein/metabolism , Polypyrimidine Tract-Binding Protein/genetics , RNA Splicing Factors/metabolism , RNA Splicing Factors/genetics , Heterogeneous-Nuclear Ribonucleoproteins/metabolism , Heterogeneous-Nuclear Ribonucleoproteins/genetics , RNA, Messenger/metabolism , RNA, Messenger/genetics
2.
Cell Genom ; 4(3): 100511, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38428419

ABSTRACT

The development of cancer is an evolutionary process involving the sequential acquisition of genetic alterations that disrupt normal biological processes, enabling tumor cells to rapidly proliferate and eventually invade and metastasize to other tissues. We investigated the genomic evolution of prostate cancer through the application of three separate classification methods, each designed to investigate a different aspect of tumor evolution. Integrating the results revealed the existence of two distinct types of prostate cancer that arise from divergent evolutionary trajectories, designated as the Canonical and Alternative evolutionary disease types. We therefore propose the evotype model for prostate cancer evolution wherein Alternative-evotype tumors diverge from those of the Canonical-evotype through the stochastic accumulation of genetic alterations associated with disruptions to androgen receptor DNA binding. Our model unifies many previous molecular observations, providing a powerful new framework to investigate prostate cancer disease progression.


Subject(s)
Prostatic Neoplasms , Male , Humans , Prostatic Neoplasms/genetics , Prostate/metabolism , Mutation , Genomics , Evolution, Molecular
3.
Nucleic Acids Res ; 52(D1): D1018-D1023, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37850641

ABSTRACT

The usage of alternative terminal exons results in messenger RNA (mRNA) isoforms that differ in their 3' untranslated regions (3' UTRs) and often also in their protein-coding sequences. Alternative 3' UTRs contain different sets of cis-regulatory elements known to regulate mRNA stability, translation and localization, all of which are vital to cell identity and function. In previous work, we revealed that ∼25 percent of the experimentally observed RNA 3' ends are located within regions currently annotated as intronic, indicating that many 3' end isoforms remain to be uncovered. Also, the inclusion of not-yet-annotated terminal exons is more tissue specific than that of already annotated ones. Here, we present the single cell-based Terminal Exon Annotation database (scTEA-db, www.scTEA-db.org) that provides the community with 12 063 previously unannotated terminal exons and associated transcript isoforms identified by analysing 53 069 publicly available single cell transcriptomes. Our scTEA-db web portal offers an array of features to find and explore novel terminal exons belonging to 5538 human genes, 110 of which are known cancer drivers. In summary, scTEA-db provides the foundation for studying the biological role of large numbers of previously unannotated terminal exon isoforms in cell identity and function.


Subject(s)
Alternative Splicing , Databases, Genetic , Gene Expression Profiling , Single-Cell Analysis , Humans , 3' Untranslated Regions/genetics , Base Sequence , Exons/genetics , Protein Isoforms/genetics , Transcriptome/genetics
4.
Cell Genom ; 2(11), 2022 Nov 09.
Article in English | MEDLINE | ID: mdl-36388765

ABSTRACT

Mutational signature analysis is commonly performed in cancer genomic studies. Here, we present SigProfilerExtractor, an automated tool for de novo extraction of mutational signatures, and benchmark it against 13 other bioinformatics tools by using 34 scenarios encompassing 2,500 simulated signatures found in 60,000 synthetic genomes and 20,000 synthetic exomes. For simulations with 5% noise, reflecting high-quality datasets, SigProfilerExtractor outperforms other approaches by elucidating between 20% and 50% more true-positive signatures while yielding 5-fold fewer false-positive signatures. Applying SigProfilerExtractor to 4,643 whole-genome- and 19,184 whole-exome-sequenced cancers reveals four novel signatures. Two of the signatures are confirmed in independent cohorts, and one of these signatures is associated with tobacco smoking. In summary, this report provides a reference tool for analysis of mutational signatures, a comprehensive benchmarking of bioinformatics tools for extracting signatures, and several novel mutational signatures, including one putatively attributed to direct tobacco smoking mutagenesis in bladder tissues.

5.
Viruses ; 14(7)2022 06 29.
Article in English | MEDLINE | ID: mdl-35891416

ABSTRACT

Viruses have evolved numerous mechanisms to exploit the molecular machinery of their host cells, including the broad spectrum of host RNA-binding proteins (RBPs). However, the RBP interactomes of most viruses are largely unknown. To shed light on the interaction landscape of RNA viruses with human host cell RBPs, we have analysed 197 single-stranded RNA (ssRNA) viral genome sequences and found that the majority of ssRNA virus genomes are significantly enriched or depleted in motifs for specific human RBPs, suggesting selection pressure on these interactions. To facilitate tailored investigations and the analysis of genomes sequenced in future, we have released our methodology as a fast and user-friendly computational toolbox named SMEAGOL. Our resources will contribute to future studies of specific ssRNA virus-host cell interactions and support the identification of antiviral drug targets.


Subject(s)
RNA Viruses , Viruses , Base Sequence , Genome, Viral , Humans , RNA , RNA Viruses/metabolism , RNA, Viral/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Viruses/genetics
6.
Viruses ; 14(5)2022 05 05.
Article in English | MEDLINE | ID: mdl-35632715

ABSTRACT

The International Virus Bioinformatics Meeting 2022 took place online on 23-25 March 2022 and attracted about 380 participants from all over the world. The goal of the meeting was to provide a meaningful and interactive scientific environment to promote discussion and collaboration and to inspire and suggest new research directions and questions. The participants created a highly interactive scientific environment even without physical face-to-face interactions. This meeting is a focal point to gain an insight into the state-of-the-art of the virus bioinformatics research landscape and to interact with researchers in the forefront as well as aspiring young scientists. The meeting featured eight invited and 18 contributed talks in eight sessions on three days, as well as 52 posters, which were presented during three virtual poster sessions. The main topics were: SARS-CoV-2, viral emergence and surveillance, virus-host interactions, viral sequence analysis, virus identification and annotation, phages, and viral diversity. This report summarizes the main research findings and highlights presented at the meeting.


Subject(s)
COVID-19 , Viruses, Unclassified , Viruses , Computational Biology , DNA Viruses , Humans , SARS-CoV-2
7.
Am J Hum Genet ; 109(5): 953-960, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35460607

ABSTRACT

We report an autosomal recessive, multi-organ tumor predisposition syndrome, caused by bi-allelic loss-of-function germline variants in the base excision repair (BER) gene MBD4. We identified five individuals with bi-allelic MBD4 variants within four families and these individuals had a personal and/or family history of adenomatous colorectal polyposis, acute myeloid leukemia, and uveal melanoma. MBD4 encodes a glycosylase involved in repair of G:T mismatches resulting from deamination of 5'-methylcytosine. The colorectal adenomas from MBD4-deficient individuals showed a mutator phenotype attributable to mutational signature SBS1, consistent with the function of MBD4. MBD4-deficient polyps harbored somatic mutations in similar driver genes to sporadic colorectal tumors, although AMER1 mutations were more common and KRAS mutations less frequent. Our findings expand the role of BER deficiencies in tumor predisposition. Inclusion of MBD4 in genetic testing for polyposis and multi-tumor phenotypes is warranted to improve disease management.


Subject(s)
Adenomatous Polyposis Coli , Colorectal Neoplasms , Uveal Neoplasms , Adenomatous Polyposis Coli/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Endodeoxyribonucleases/genetics , Genetic Predisposition to Disease , Germ Cells/pathology , Germ-Line Mutation/genetics , Humans , Uveal Neoplasms/genetics
8.
Nat Commun ; 12(1): 6946, 2021 11 26.
Article in English | MEDLINE | ID: mdl-34836952

ABSTRACT

Black women across the African diaspora experience more aggressive breast cancer with higher mortality rates than white women of European ancestry. Although inter-ethnic germline variation is known, differential somatic evolution has not been investigated in detail. Analysis of deep whole genomes of 97 breast cancers, with RNA-seq in a subset, from women in Nigeria in comparison with The Cancer Genome Atlas (n = 76) reveals a higher rate of genomic instability and increased intra-tumoral heterogeneity as well as a unique genomic subtype defined by early clonal GATA3 mutations with a 10.5-year younger age at diagnosis. We also find non-coding mutations in bona fide drivers (ZNF217 and SYPL1) and a previously unreported INDEL signature strongly associated with African ancestry proportion, underscoring the need to expand inclusion of diverse populations in biomedical research. Finally, we demonstrate that characterizing tumors for homologous recombination deficiency has significant clinical relevance in stratifying patients for potentially life-saving therapies.


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Clonal Evolution , Health Status Disparities , Adult , Aged , Biopsy , Black People/ethnology , Black People/genetics , Breast/pathology , Breast Neoplasms/ethnology , Breast Neoplasms/mortality , Breast Neoplasms/pathology , DNA Mutational Analysis , Female , GATA3 Transcription Factor/genetics , Genetic Heterogeneity , Genomic Instability , Germ-Line Mutation , Humans , Middle Aged , Nigeria/epidemiology , Nigeria/ethnology , RNA-Seq , Risk Assessment , Synaptophysin/genetics , Trans-Activators/genetics , Tumor Microenvironment/genetics , White People/ethnology , White People/genetics , Whole Genome Sequencing
9.
Commun Biol ; 4(1): 590, 2021 05 17.
Article in English | MEDLINE | ID: mdl-34002013

ABSTRACT

The novel betacoronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a worldwide pandemic (COVID-19) after emerging in Wuhan, China. Here we analyzed public host and viral RNA sequencing data to better understand how SARS-CoV-2 interacts with human respiratory cells. We identified genes, isoforms and transposable element families that are specifically altered in SARS-CoV-2-infected respiratory cells. Well-known immunoregulatory genes including CSF2, IL32, IL-6 and SERPINA3 were differentially expressed, while immunoregulatory transposable element families were upregulated. We predicted conserved interactions between the SARS-CoV-2 genome and human RNA-binding proteins such as the heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) and eukaryotic initiation factor 4 (eIF4b). We also identified a viral sequence variant with a statistically significant skew associated with age of infection, that may contribute to intracellular host-pathogen interactions. These findings can help identify host mechanisms that can be targeted by prophylactics and/or therapeutics to reduce the severity of COVID-19.


Subject(s)
COVID-19/genetics , Computational Biology/methods , Host-Pathogen Interactions/genetics , Pandemics , SARS-CoV-2/genetics , Binding Sites , COVID-19/virology , Cytokines/genetics , Databases, Genetic , Gene Expression Regulation , Genome, Viral , Humans , RNA, Viral/genetics , RNA, Viral/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA-Seq , Serpins/genetics , Signal Transduction/genetics , Transcriptome , Virus Replication/genetics
10.
Nucleic Acids Res ; 48(D1): D174-D179, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31617559

ABSTRACT

Generated by 3' end cleavage and polyadenylation at alternative polyadenylation (poly(A)) sites, alternative terminal exons account for much of the variation between human transcript isoforms. More than a dozen protocols have been developed so far for capturing and sequencing RNA 3' ends from a variety of cell types and species. In previous studies, we have used these data to uncover novel regulatory signals and cell type-specific isoforms. Here we present an update of the PolyASite (https://polyasite.unibas.ch) resource of poly(A) sites, constructed from publicly available human, mouse and worm 3' end sequencing datasets by enforcing uniform quality measures, including the flagging of putative internal priming sites. Through integrated processing of all data, we identified and clustered sites that are closely spaced and share polyadenylation signals, as these are likely the result of stochastic variations in processing. For each cluster, we identified the representative - most frequently processed - site and estimated the relative use in the transcriptome across all samples. We have established a modern web portal for efficient finding, exploration and export of data. Database generation is fully automated, greatly facilitating incorporation of new datasets and the updating of underlying genome resources.


Subject(s)
Databases, Nucleic Acid , Polyadenylation , Animals , Caenorhabditis elegans/genetics , Humans , Mice , Poly A/analysis , Sequence Analysis, RNA
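
The clustering step summarized in the PolyASite abstract above (merging closely spaced sites, picking the most frequently processed site as representative, and estimating relative use) can be illustrated with a minimal sketch. This is not the PolyASite pipeline: the `Site` record, the function names, the `max_gap` threshold of 12 nt, and the definition of relative use as a cluster's share of all reads are assumptions made for illustration, and the abstract's additional criterion of shared polyadenylation signals is omitted.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record for one observed 3' end position."""
    chrom: str
    strand: str
    pos: int     # cleavage position
    reads: int   # supporting reads, summed over all samples


def cluster_sites(sites, max_gap=12):
    """Group same-strand sites whose neighbours lie within max_gap nt.

    max_gap is an illustrative threshold, not a value from the abstract.
    """
    clusters = []
    ordered = sorted(sites, key=lambda s: (s.chrom, s.strand, s.pos))
    for site in ordered:
        last = clusters[-1][-1] if clusters else None
        if (last is not None
                and (last.chrom, last.strand) == (site.chrom, site.strand)
                and site.pos - last.pos <= max_gap):
            clusters[-1].append(site)   # extend the current cluster
        else:
            clusters.append([site])     # start a new cluster
    return clusters


def summarize(clusters):
    """Report each cluster's span, representative site and relative use."""
    total = sum(s.reads for c in clusters for s in c)
    summary = []
    for c in clusters:
        rep = max(c, key=lambda s: s.reads)   # most frequently processed site
        reads = sum(s.reads for s in c)
        summary.append({
            "chrom": rep.chrom,
            "strand": rep.strand,
            "start": c[0].pos,
            "end": c[-1].pos,
            "representative": rep.pos,
            # Assumed definition: the cluster's share of all reads.
            "relative_use": reads / total if total else 0.0,
        })
    return summary
```

Calling `summarize(cluster_sites(sites))` on a list of `Site` records returns one dictionary per cluster. A real resource would additionally filter putative internal priming sites and normalize per sample before combining counts, as the abstract indicates.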
12.
Nat Rev Genet ; 20(10): 599-614, 2019 10.
Article in English | MEDLINE | ID: mdl-31267064

ABSTRACT

Most human genes have multiple sites at which RNA 3' end cleavage and polyadenylation can occur, enabling the expression of distinct transcript isoforms under different conditions. Novel methods to sequence RNA 3' ends have generated comprehensive catalogues of polyadenylation (poly(A)) sites; their analysis using innovative computational methods has revealed how poly(A) site choice is regulated by core RNA 3' end processing factors, such as cleavage factor I and cleavage and polyadenylation specificity factor, as well as by other RNA-binding proteins, particularly splicing factors. Here, we review the experimental and computational methods that have enabled the global mapping of mRNA and of long non-coding RNA 3' ends, quantification of the resulting isoforms and the discovery of regulators of alternative cleavage and polyadenylation (APA). We highlight the different types of APA-derived isoforms and their functional differences, and illustrate how APA contributes to human diseases, including cancer and haematological, immunological and neurological diseases.


Subject(s)
Disease/genetics , Polyadenylation/genetics , 3' Untranslated Regions/genetics , Animals , Health , Humans , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , RNA-Binding Proteins/genetics
13.
Nat Methods ; 15(10): 832-836, 2018 10.
Article in English | MEDLINE | ID: mdl-30202060

ABSTRACT

Sequencing of RNA 3' ends has uncovered numerous sites that do not correspond to the termination sites of known transcripts. Through their 3' untranslated regions, protein-coding RNAs interact with RNA-binding proteins and microRNAs, which regulate many properties, including RNA stability and subcellular localization. We developed the terminal exon characterization (TEC) tool ( http://tectool.unibas.ch ), which can be used with RNA-sequencing data from any species for which a genome annotation that includes sites of RNA cleavage and polyadenylation is available. We discovered hundreds of previously unknown isoforms and cell-type-specific terminal exons in human cells. Ribosome profiling data revealed that many of these isoforms were translated. By applying TECtool to single-cell sequencing data, we found that the newly identified isoforms were expressed in subpopulations of cells. Thus, TECtool enables the identification of previously unknown isoforms in well-studied cell systems and in rare cell types.


Subject(s)
Alternative Splicing , Computational Biology/methods , Exons/genetics , High-Throughput Nucleotide Sequencing/methods , RNA, Messenger/genetics , Software , Gene Expression Profiling , Humans , Polyadenylation , Protein Isoforms , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Sequence Analysis, RNA , Tissue Distribution
14.
Mol Syst Biol ; 14(8): e8266, 2018 08 27.
Article in English | MEDLINE | ID: mdl-30150282

ABSTRACT

miRNAs are small RNAs that regulate gene expression post-transcriptionally. By repressing the translation and promoting the degradation of target mRNAs, miRNAs may reduce the cell-to-cell variability in protein expression, induce correlations between target expression levels, and provide a layer through which targets can influence each other's expression as "competing RNAs" (ceRNAs). However, experimental evidence for these behaviors is limited. Combining mathematical modeling with RNA sequencing of individual human embryonic kidney cells in which the expression of two distinct miRNAs was induced over a wide range, we have inferred parameters describing the response of hundreds of miRNA targets to miRNA induction. Individual targets have widely different response dynamics, and only a small proportion of predicted targets exhibit high sensitivity to miRNA induction. Our data reveal for the first time the response parameters of the entire network of endogenous miRNA targets to miRNA induction, demonstrating that miRNAs correlate target expression and at the same time increase the variability in expression of individual targets across cells. The approach is generalizable to other miRNAs and post-transcriptional regulators to improve the understanding of gene expression dynamics in individual cell types.


Subject(s)
Gene Regulatory Networks/genetics , MicroRNAs/genetics , RNA, Messenger/genetics , Single-Cell Analysis , Computational Biology , Gene Expression Profiling , Gene Expression Regulation/genetics , HEK293 Cells , Humans , Models, Theoretical , Sequence Analysis, RNA
15.
Genome Biol ; 19(1): 44, 2018 03 28.
Article in English | MEDLINE | ID: mdl-29592812

ABSTRACT

3' untranslated region (3' UTR) length is regulated in relation to cellular state. To uncover key regulators of poly(A) site use in specific conditions, we have developed PAQR, a method for quantifying poly(A) site use from RNA sequencing data, and KAPAC, an approach that infers activities of oligomeric sequence motifs on poly(A) site choice. Application of PAQR and KAPAC to RNA sequencing data from normal and tumor tissue samples uncovers motifs that can explain changes in cleavage and polyadenylation in specific cancers. In particular, our analysis points to polypyrimidine tract binding protein 1 as a regulator of poly(A) site choice in glioblastoma.


Subject(s)
3' Untranslated Regions , Polyadenylation , Sequence Analysis, RNA , Glioblastoma/genetics , Glioblastoma/metabolism , Humans , Male , Nucleotide Motifs , Polypyrimidine Tract-Binding Protein/metabolism , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , RNA-Binding Proteins/metabolism , mRNA Cleavage and Polyadenylation Factors/metabolism
16.
J Vis Exp ; (128)2017 10 10.
Article in English | MEDLINE | ID: mdl-29053696

ABSTRACT

Studies in the last decade have revealed a complex and dynamic variety of pre-mRNA cleavage and polyadenylation reactions. mRNAs with long 3' untranslated regions (UTRs) are generated in differentiated cells whereas proliferating cells preferentially express transcripts with short 3' UTRs. We describe the A-seq protocol, now at its second version, which was developed to map polyadenylation sites genome-wide and study the regulation of pre-mRNA 3' end processing. This current protocol also takes advantage of the polyadenylate (poly(A)) tails that are added during the biogenesis of most mammalian mRNAs to enrich for fully processed mRNAs. A DNA adaptor with deoxyuracil at its fourth position allows the precise processing of mRNA 3' end fragments for sequencing. Not including the cell culture and the overnight ligations, the protocol requires about 8 h hands-on time. Along with it, an easy-to-use software package for the analysis of the derived sequencing data is provided. A-seq2 and the associated analysis software provide an efficient and reliable solution to the mapping of pre-mRNA 3' ends in a wide range of conditions, from 10^6 or fewer cells.


Subject(s)
3' Untranslated Regions/genetics , Gene Library , Animals , Polyadenylation
17.
Biol Direct ; 12(1): 8, 2017 04 17.
Article in English | MEDLINE | ID: mdl-28412966

ABSTRACT

BACKGROUND: The transition between epithelial and mesenchymal phenotypes (EMT) occurs in a variety of contexts. It is critical for mammalian development and it is also involved in tumor initiation and progression. Master transcription factor (TF) regulators of this process are conserved between mouse and human. METHODS: From a computational analysis of a variety of high-throughput sequencing data sets we initially inferred that TFAP2A is connected to the core EMT network in both species. We then analysed publicly available human breast cancer data for TFAP2A expression and also studied the expression (by mRNA sequencing), activity (by monitoring the expression of its predicted targets), and binding (by electrophoretic mobility shift assay and chromatin immunoprecipitation) of this factor in a mouse mammary gland EMT model system (NMuMG) cell line. RESULTS: We found that upon induction of EMT, the activity of TFAP2A, reflected in the expression level of its predicted targets, is up-regulated in a variety of systems, both murine and human, while TFAP2A's expression is increased in more "stem-like" cancers. We provide strong evidence for the direct interaction between the TFAP2A TF and the ZEB2 promoter and we demonstrate that this interaction affects ZEB2 expression. Overexpression of TFAP2A from an exogenous construct perturbs EMT, however, in a manner similar to the downregulation of endogenous TFAP2A that takes place during EMT. CONCLUSIONS: Our study reveals that TFAP2A is a conserved component of the core network that regulates EMT, acting as a repressor of many genes, including ZEB2. REVIEWERS: This article has been reviewed by Dr. Martijn Huynen and Dr. Nicola Aceto.


Subject(s)
Epithelial-Mesenchymal Transition/genetics , Gene Expression Regulation, Neoplastic , Homeodomain Proteins/genetics , Repressor Proteins/genetics , Transcription Factor AP-2/metabolism , Zinc Finger E-box-Binding Homeobox 1/genetics , Animals , Cell Line , Female , Homeodomain Proteins/metabolism , Humans , Mammary Glands, Animal/metabolism , Mammary Glands, Animal/physiopathology , Mice , Repressor Proteins/metabolism , Transcription Factor AP-2/genetics , Transforming Growth Factor beta1/genetics , Transforming Growth Factor beta1/metabolism , Zinc Finger E-box Binding Homeobox 2 , Zinc Finger E-box-Binding Homeobox 1/metabolism
19.
Sci Rep ; 6: 34589, 2016 10 07.
Article in English | MEDLINE | ID: mdl-27713552

ABSTRACT

The unprecedented outbreak of Ebola in West Africa resulted in over 28,000 cases and 11,000 deaths, underlining the need for a better understanding of the biology of this highly pathogenic virus to develop specific counter strategies. Two filoviruses, the Ebola and Marburg viruses, result in a severe and often fatal infection in humans. However, bats are natural hosts and survive filovirus infections without obvious symptoms. The molecular basis of this striking difference in the response to filovirus infections is not well understood. We report a systematic overview of differentially expressed genes, activity motifs and pathways in human and bat cells infected with the Ebola and Marburg viruses, and we demonstrate that the replication of filoviruses is more rapid in human cells than in bat cells. We also found that the most strongly regulated genes upon filovirus infection are chemokine ligands and transcription factors. We observed a strong induction of the JAK/STAT pathway, of several genes encoding inhibitors of MAP kinases (DUSP genes) and of PPP1R15A, which is involved in ER stress-induced cell death. We used comparative transcriptomics to provide a data resource that can be used to identify cellular responses that might allow bats to survive filovirus infections.


Subject(s)
Ebolavirus/metabolism , Gene Expression Regulation , Hemorrhagic Fever, Ebola/metabolism , Marburg Virus Disease/metabolism , Marburgvirus/metabolism , Signal Transduction , Transcription, Genetic , Animals , Cell Line, Tumor , Chiroptera , Humans
20.
Genome Res ; 26(8): 1145-59, 2016 08.
Article in English | MEDLINE | ID: mdl-27382025

ABSTRACT

Alternative polyadenylation (APA) is a general mechanism of transcript diversification in mammals, which has been recently linked to proliferative states and cancer. Different 3' untranslated region (3' UTR) isoforms interact with different RNA-binding proteins (RBPs), which modify the stability, translation, and subcellular localization of the corresponding transcripts. Although the heterogeneity of pre-mRNA 3' end processing has been established with high-throughput approaches, the mechanisms that underlie systematic changes in 3' UTR lengths remain to be characterized. Through a uniform analysis of a large number of 3' end sequencing data sets, we have uncovered 18 signals, six of which are novel, whose positioning with respect to pre-mRNA cleavage sites indicates a role in pre-mRNA 3' end processing in both mouse and human. With 3' end sequencing we have demonstrated that the heterogeneous ribonucleoprotein C (HNRNPC), which binds the poly(U) motif whose frequency also peaks in the vicinity of polyadenylation (poly(A)) sites, has a genome-wide effect on poly(A) site usage. HNRNPC-regulated 3' UTRs are enriched in ELAV-like RBP 1 (ELAVL1) binding sites and include those of the CD47 gene, which participate in the recently discovered mechanism of 3' UTR-dependent protein localization (UDPL). Our study thus establishes an up-to-date, high-confidence catalog of 3' end processing sites and poly(A) signals, and it uncovers an important role of HNRNPC in regulating 3' end processing. It further suggests that U-rich elements mediate interactions with multiple RBPs that regulate different stages in a transcript's life cycle.


Subject(s)
Heterogeneous-Nuclear Ribonucleoprotein Group C/genetics , Polyadenylation/genetics , RNA-Binding Proteins/genetics , Transcription, Genetic , 3' Untranslated Regions/genetics , Animals , Binding Sites , Cytoplasm/genetics , Gene Expression , Humans , Mice , RNA, Messenger/genetics