Search | VHL Regional Portal

RNA-binding proteins that lack canonical RNA-binding domains are rarely sequence-specific.

Ray, Debashish; Laverty, Kaitlin U; Jolma, Arttu; Nie, Kate; Samson, Reuben; Pour, Sara E; Tam, Cyrus L; von Krosigk, Niklas; Nabeel-Shah, Syed; Albu, Mihai; Zheng, Hong; Perron, Gabrielle; Lee, Hyunmin; Najafabadi, Hamed; Blencowe, Benjamin; Greenblatt, Jack; Morris, Quaid; Hughes, Timothy R.

Sci Rep ; 13(1): 5238, 2023 03 31.

Article in English | MEDLINE | ID: mdl-37002329

ABSTRACT

Thousands of RNA-binding proteins (RBPs) crosslink to cellular mRNA. Among these are numerous unconventional RBPs (ucRBPs): proteins that associate with RNA but lack known RNA-binding domains (RBDs). The vast majority of ucRBPs have uncharacterized RNA-binding specificities. We analyzed 492 human ucRBPs for intrinsic RNA-binding in vitro and identified 23 that bind specific RNA sequences. Most (17/23), including 8 ribosomal proteins, were previously associated with RNA-related function. We identified the RBDs responsible for sequence-specific RNA-binding for several of these 23 ucRBPs and surveyed whether corresponding domains from homologous proteins also display RNA sequence specificity. CCHC-zf domains from seven human proteins recognized specific RNA motifs, indicating that this is a major class of RBD. For Nudix, HABP4, TPR, RanBP2-zf, and L7Ae domains, however, only isolated members or closely related homologs yielded motifs, consistent with RNA-binding as a derived function. The lack of sequence specificity for most ucRBPs is striking, and we suggest that many may function analogously to chromatin factors, which often crosslink efficiently to cellular DNA, presumably via indirect recruitment. Finally, we show that ucRBPs tend to be highly abundant proteins and suggest their identification in RNA interactome capture studies could also result from weak nonspecific interactions with RNA.

Subject(s)

RNA-Binding Proteins , RNA , Humans , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA/metabolism , Ribosomal Proteins/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Motifs/genetics , Protein Binding , Myogenic Regulatory Factors/metabolism

PRIESSTESS: interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins.

Laverty, Kaitlin U; Jolma, Arttu; Pour, Sara E; Zheng, Hong; Ray, Debashish; Morris, Quaid; Hughes, Timothy R.

Nucleic Acids Res ; 50(19): e111, 2022 10 28.

Article in English | MEDLINE | ID: mdl-36018788

ABSTRACT

Modelling both primary sequence and secondary structure preferences for RNA binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA binding datasets. PRIESSTESS identifies dozens of enriched RNA sequence and/or structure motifs that are subsequently reduced to a set of core motifs by logistic regression with LASSO regularization. Importantly, these core motifs are easily visualized and interpreted, and provide a measure of RBP secondary structure specificity. We used PRIESSTESS to interrogate new HTR-SELEX data for 23 RBPs with diverse RNA binding modes and captured known primary sequence and secondary structure preferences for each. Moreover, when applying PRIESSTESS to 144 RBPs across 202 RNA binding datasets, 75% showed an RNA secondary structure preference but only 10% had a preference besides unpaired bases, suggesting that most RBPs simply recognize the accessibility of primary sequences.

Subject(s)

Algorithms , RNA-Binding Proteins , Binding Sites , RNA-Binding Proteins/metabolism , Nucleotide Motifs , RNA/chemistry , Protein Binding

Binding specificities of human RNA-binding proteins toward structured and linear RNA sequences.

Jolma, Arttu; Zhang, Jilin; Mondragón, Estefania; Morgunova, Ekaterina; Kivioja, Teemu; Laverty, Kaitlin U; Yin, Yimeng; Zhu, Fangjie; Bourenkov, Gleb; Morris, Quaid; Hughes, Timothy R; Maher, Louis James; Taipale, Jussi.

Genome Res ; 30(7): 962-973, 2020 07.

Article in English | MEDLINE | ID: mdl-32703884

ABSTRACT

RNA-binding proteins (RBPs) regulate RNA metabolism at multiple levels by affecting splicing of nascent transcripts, RNA folding, base modification, transport, localization, translation, and stability. Despite their central role in RNA function, the RNA-binding specificities of most RBPs remain unknown or incompletely defined. To address this, we have assembled a genome-scale collection of RBPs and their RNA-binding domains (RBDs) and assessed their specificities using high-throughput RNA-SELEX (HTR-SELEX). Approximately 70% of RBPs for which we obtained a motif bound to short linear sequences, whereas ~30% preferred structured motifs folding into stem-loops. We also found that many RBPs can bind to multiple distinctly different motifs. Analysis of the matches of the motifs in human genomic sequences suggested novel roles for many RBPs. We found that three cytoplasmic proteins (ZC3H12A, ZC3H12B, and ZC3H12C) bound to motifs resembling the splice donor sequence, suggesting that these proteins are involved in degradation of cytoplasmic viral and/or unspliced transcripts. Structural analysis revealed that the RNA motif was not bound by the conventional C3H1 RNA-binding domain of ZC3H12B. Instead, the RNA motif was bound by ZC3H12B's PilT N terminus (PIN) RNase domain, revealing a potential mechanism by which unconventional RBDs containing active sites or molecule-binding pockets could interact with short, structured RNA molecules. Our collection containing 145 high-resolution binding specificity models for 86 RBPs is the largest systematic resource for the analysis of human RBPs and will greatly facilitate future analysis of the various biological roles of this important class of proteins.

Subject(s)

RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , RNA/chemistry , RNA/metabolism , Base Sequence , Genome, Human , Humans , Nucleic Acid Conformation , Nucleotide Motifs , Protein Binding , Protein Domains , Protein Multimerization , Ribonucleases/chemistry , Ribonucleases/metabolism , SELEX Aptamer Technique

A physical and genetic map of Cannabis sativa identifies extensive rearrangements at the THC/CBD acid synthase loci.

Laverty, Kaitlin U; Stout, Jake M; Sullivan, Mitchell J; Shah, Hardik; Gill, Navdeep; Holbrook, Larry; Deikus, Gintaras; Sebra, Robert; Hughes, Timothy R; Page, Jonathan E; van Bakel, Harm.

Genome Res ; 29(1): 146-156, 2019 01.

Article in English | MEDLINE | ID: mdl-30409771

ABSTRACT

Cannabis sativa is widely cultivated for medicinal, food, industrial, and recreational use, but much remains unknown regarding its genetics, including the molecular determinants of cannabinoid content. Here, we describe a combined physical and genetic map derived from a cross between the drug-type strain Purple Kush and the hemp variety "Finola." The map reveals that cannabinoid biosynthesis genes are generally unlinked but that aromatic prenyltransferase (AP), which produces the substrate for THCA and CBDA synthases (THCAS and CBDAS), is tightly linked to a known marker for total cannabinoid content. We further identify the gene encoding CBCA synthase (CBCAS) and characterize its catalytic activity, providing insight into how cannabinoid diversity arises in cannabis. THCAS and CBDAS (which determine the drug vs. hemp chemotype) are contained within large (>250 kb) retrotransposon-rich regions that are highly nonhomologous between drug- and hemp-type alleles and are furthermore embedded within ~40 Mb of minimally recombining repetitive DNA. The chromosome structures are similar to those in grains such as wheat, with recombination focused in gene-rich, repeat-depleted regions near chromosome ends. The physical and genetic map should facilitate further dissection of genetic and molecular mechanisms in this commercially and medically important plant.

Subject(s)

Cannabinoids , Cannabis , Chromosome Mapping , Chromosomes, Plant , Ligases , Plant Proteins , Cannabinoids/biosynthesis , Cannabinoids/genetics , Cannabis/genetics , Cannabis/metabolism , Chromosomes, Plant/genetics , Chromosomes, Plant/metabolism , Gene Rearrangement , Ligases/genetics , Ligases/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism

Motif models for RNA-binding proteins.

Sasse, Alexander; Laverty, Kaitlin U; Hughes, Timothy R; Morris, Quaid D.

Curr Opin Struct Biol ; 53: 115-123, 2018 12.

Article in English | MEDLINE | ID: mdl-30172081

ABSTRACT

Identifying the binding preferences of RNA-binding proteins (RBPs) is important in understanding their contribution to post-transcriptional regulation. Here, we review the current state of the art of RNA motif identification tools for RBPs. New in vivo and in vitro data sets provide sufficient statistical power to enable detection of relatively long and complex sequence and sequence-structure binding preferences, and recent computational methods are geared towards quantitative identification of these patterns. We classify methods by their motif model's representational power and describe the underlying considerations for RNA-protein interactions. All classical motif identification algorithms apply physically motivated architectures, consisting of a motif and an occupancy model; we call these explicit motif models. Recent methods, such as convolutional neural networks and support vector machines, abandon the classical architecture and implicitly model RNA binding without defining a motif model. Although they achieve high accuracy on held-out data, they may be unsuitable for the ultimate goal of the field: using motifs trained on in vitro data to predict in vivo binding sites. For this task, methods need to separate intrinsic binding preferences from cellular effects from protein and RNA concentrations, cooperativity, and competition. To tackle this problem, we advocate for the use of a 'three-layer' architecture, consisting of motif model, occupancy model, and extrinsic factor model, which enables separation and adjustment to cellular conditions.

Subject(s)

Models, Molecular , RNA-Binding Proteins/chemistry , RNA/chemistry , Algorithms , Binding Sites , Computational Biology/methods , Molecular Conformation , Nucleic Acid Conformation , Nucleotide Motifs , Protein Binding

RNAcompete-S: Combined RNA sequence/structure preferences for RNA binding proteins derived from a single-step in vitro selection.

Cook, Kate B; Vembu, Shankar; Ha, Kevin C H; Zheng, Hong; Laverty, Kaitlin U; Hughes, Timothy R; Ray, Debashish; Morris, Quaid D.

Methods ; 126: 18-28, 2017 08 15.

Article in English | MEDLINE | ID: mdl-28651966

ABSTRACT

RNA-binding proteins recognize RNA sequences and structures, but there is currently no systematic and accurate method to derive large (>12-base) motifs de novo that reflect a combination of intrinsic preference for both sequence and structure. To address this absence, we introduce RNAcompete-S, which couples a single-step competitive binding reaction with an excess of random RNA 40-mers to a custom computational pipeline for interrogation of the bound RNA sequences and derivation of SSMs (Sequence and Structure Models). RNAcompete-S confirms that HuR, QKI, and SRSF1 prefer binding sites that are single stranded, and recapitulates known 8-10 bp sequence and structure preferences for Vts1p and RBMY. We also derive an 18-base long SSM for Drosophila SLBP, which to our knowledge has not been previously determined by selections from pure random sequence, and accurately discriminates human replication-dependent histone mRNAs. Thus, RNAcompete-S enables accurate identification of large, intrinsic sequence-structure specificities with a uniform assay.

Subject(s)

Base Sequence/genetics , High-Throughput Nucleotide Sequencing/methods , RNA-Binding Proteins/genetics , Humans , RNA-Binding Proteins/chemistry , Sequence Analysis, RNA/methods
