1.
Sci Rep ; 13(1): 5238, 2023 03 31.
Article in English | MEDLINE | ID: mdl-37002329

ABSTRACT

Thousands of RNA-binding proteins (RBPs) crosslink to cellular mRNA. Among these are numerous unconventional RBPs (ucRBPs), proteins that associate with RNA but lack known RNA-binding domains (RBDs). The vast majority of ucRBPs have uncharacterized RNA-binding specificities. We analyzed 492 human ucRBPs for intrinsic RNA-binding in vitro and identified 23 that bind specific RNA sequences. Most (17/23), including 8 ribosomal proteins, were previously associated with RNA-related function. We identified the RBDs responsible for sequence-specific RNA-binding for several of these 23 ucRBPs and surveyed whether corresponding domains from homologous proteins also display RNA sequence specificity. CCHC-zf domains from seven human proteins recognized specific RNA motifs, indicating that this is a major class of RBD. For Nudix, HABP4, TPR, RanBP2-zf, and L7Ae domains, however, only isolated members or closely related homologs yielded motifs, consistent with RNA-binding as a derived function. The lack of sequence specificity for most ucRBPs is striking, and we suggest that many may function analogously to chromatin factors, which often crosslink efficiently to cellular DNA, presumably via indirect recruitment. Finally, we show that ucRBPs tend to be highly abundant proteins and suggest their identification in RNA interactome capture studies could also result from weak nonspecific interactions with RNA.


Subject(s)
RNA-Binding Proteins , RNA , Humans , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA/metabolism , Ribosomal Proteins/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Motifs/genetics , Protein Binding , Myogenic Regulatory Factors/metabolism
2.
Genetics ; 221(3), 2022 07 04.
Article in English | MEDLINE | ID: mdl-35552404

ABSTRACT

Sequences derived from the Long INterspersed Element-1 (L1) family of retrotransposons occupy at least 17% of the human genome, with 67 distinct subfamilies representing successive waves of expansion and extinction in mammalian lineages. L1s contribute extensively to gene regulation, but their molecular history is difficult to trace, because most are present only as truncated and highly mutated fossils. Consequently, L1 entries in current databases of repeat sequences are composed mainly of short diagnostic subsequences, rather than full functional progenitor sequences for each subfamily. Here, we have coupled 2 levels of sequence reconstruction (at the level of whole genomes and L1 subfamilies) to reconstruct progenitor sequences for all human L1 subfamilies that are more functionally and phylogenetically plausible than existing models. Most of the reconstructed sequences are at or near the canonical length of L1s and encode uninterrupted ORFs with expected protein domains. We also show that the presence or absence of binding sites for KRAB-C2H2 Zinc Finger Proteins, even in ancient-reconstructed progenitor L1s, mirrors binding observed in human ChIP-exo experiments, thus extending the arms race and domestication model. RepeatMasker searches of the modern human genome suggest that the new models may be able to assign subfamily resolution identities to previously ambiguous L1 instances. The reconstructed L1 sequences will be useful for genome annotation and functional study of both L1 evolution and L1 contributions to host regulatory networks.


Subject(s)
Long Interspersed Nucleotide Elements , Retroelements , Animals , Evolution, Molecular , Genome, Human , Humans , Mammals/genetics , Open Reading Frames , Phylogeny , Repetitive Sequences, Nucleic Acid , Retroelements/genetics
3.
Medicina (Kaunas) ; 57(7), 2021 Jul 17.
Article in English | MEDLINE | ID: mdl-34357006

ABSTRACT

We present the case of a 35-year-old woman who developed a high-risk pulmonary embolism (according to ESC risk stratification for pulmonary embolism) after undergoing a Caesarean section. Postoperatively, she presented with acute left lower limb pain, swelling and erythema. A diagnosis was made of deep vein thrombosis (DVT) of the ilio-femoral and popliteal veins. She was started on anticoagulant therapy, which proved ineffective: the patient developed left calf and thigh oedema and shortness of breath. A CT scan revealed a high-risk embolus located in the right atrium and extending through the tricuspid valve. The decision was made to refer her to a cardiovascular surgeon. During her preoperative evaluation, the patient became hemodynamically unstable and was rushed into the operating room, severely desaturated, bradycardic, unconscious and severely hypotensive. On the basis of the severe state of the patient and the CT scan findings, we performed an emergency pulmonary embolectomy, with the patient on cardio-pulmonary by-pass, without cross-clamping the aorta, using a modified Trendelenburg procedure. This case supports using open pulmonary embolectomy for patients with hemodynamic instability on the basis of clinical diagnosis.


Subject(s)
Pulmonary Embolism , Adult , Anticoagulants/therapeutic use , Embolectomy , Female , Humans , Pulmonary Embolism/diagnostic imaging , Pulmonary Embolism/surgery , Tomography, X-Ray Computed
4.
Pharmaceutics ; 13(5), 2021 May 06.
Article in English | MEDLINE | ID: mdl-34066331

ABSTRACT

Colon cancer is the third most common cancer type worldwide and is highly dependent on DNA mutations that progressively appear and accumulate in the normal colon epithelium. Mutations in the TP53 gene appear in approximately half of these patients and have significant implications in disease progression and response to therapy. miR-125b-5p is a controversial microRNA with a dual role in cancer that has been reported to specifically target TP53 in colon adenocarcinomas. Our study investigated the differential therapeutic effect of miR-125b-5p replacement in colon cancer based on the TP53 mutation status of colon cancer cell lines. In TP53 mutated models, miR-125b-5p overexpression slows cancer cells' malignant behavior by inhibiting the invasion/migration and colony formation capacity via direct downregulation of mutated TP53. In TP53 wild type cells, the exogenous modulation of miR-125b-5p did not significantly affect the molecular and phenotypic profile. In conclusion, our data show that miR-125b-5p has an anti-cancer effect only in TP53 mutated colon cancer cells, partially explaining the dual behavior of this microRNA in malignant pathologies.

5.
Nat Genet ; 51(6): 981-989, 2019 06.
Article in English | MEDLINE | ID: mdl-31133749

ABSTRACT

Transcription factor (TF) binding specificities (motifs) are essential for the analysis of gene regulation. Accurate prediction of TF motifs is critical, because it is infeasible to assay all TFs in all sequenced eukaryotic genomes. There is ongoing controversy regarding the degree of motif diversification among related species that is, in part, because of uncertainty in motif prediction methods. Here we describe similarity regression, a significantly improved method for predicting motifs, which we use to update and expand the Cis-BP database. Similarity regression inherently quantifies TF motif evolution, and shows that previous claims of near-complete conservation of motifs between human and Drosophila are inflated, with nearly half of the motifs in each species absent from the other, largely due to extensive divergence in C2H2 zinc finger proteins. We conclude that diversification in DNA-binding motifs is pervasive, and present a new tool and updated resource to study TF diversity and gene regulation across eukaryotes.


Subject(s)
Base Sequence , Binding Sites , Evolution, Molecular , Transcription Factors/metabolism , Animals , Computational Biology/methods , Conserved Sequence , Databases, Genetic , Gene Expression Regulation , Humans , Nucleotide Motifs , Protein Binding
7.
Cell ; 172(4): 650-665, 2018 02 08.
Article in English | MEDLINE | ID: mdl-29425488

ABSTRACT

Transcription factors (TFs) recognize specific DNA sequences to control chromatin and transcription, forming a complex system that guides expression of the genome. Despite keen interest in understanding how TFs control gene expression, it remains challenging to determine how the precise genomic binding sites of TFs are specified and how TF binding ultimately relates to regulation of transcription. This review considers how TFs are identified and functionally characterized, principally through the lens of a catalog of over 1,600 likely human TFs and binding motifs for two-thirds of them. Major classes of human TFs differ markedly in their evolutionary trajectories and expression patterns, underscoring distinct functions. TFs likewise underlie many different aspects of human physiology, disease, and variation, highlighting the importance of continued effort to understand TF-mediated gene regulation.


Subject(s)
Evolution, Molecular , Gene Expression Regulation , Response Elements , Transcription Factors , Amino Acid Motifs , Humans , Transcription Factors/chemistry , Transcription Factors/classification , Transcription Factors/genetics , Transcription Factors/metabolism
8.
G3 (Bethesda) ; 8(1): 219-229, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29146583

ABSTRACT

KRAB C2H2 zinc finger proteins (KZNFs) are the largest and most diverse family of human transcription factors, likely due to diversifying selection driven by novel endogenous retroelements (EREs), but the vast majority lack binding motifs or functional data. Two recent studies analyzed a majority of the human KZNFs using either ChIP-seq (60 proteins) or ChIP-exo (221 proteins) in the same cell type (HEK293). The ChIP-exo paper did not describe binding motifs, however. Thirty-nine proteins are represented in both studies, enabling the systematic comparison of the data sets presented here. Typically, only a minority of peaks overlap, but the two studies nonetheless display significant similarity in ERE binding for 32/39, and yield highly similar DNA binding motifs for 23 and related motifs for 34 (MoSBAT similarity score >0.5 and >0.2, respectively). Thus, there is overall (albeit imperfect) agreement between the two studies. For the 242 proteins represented in at least one study, we selected a highest-confidence motif for each protein, utilizing several motif-derivation approaches, and evaluating motifs within and across data sets. Peaks for the majority (158) are enriched (96% with AUC >0.6 predicting peak vs. nonpeak) for a motif that is supported by the C2H2 "recognition code," consistent with intrinsic sequence specificity driving DNA binding in cells. An additional 63 yield motifs enriched in peaks, but not supported by the recognition code, which could reflect indirect binding. Altogether, these analyses validate both data sets, and provide a reference motif set with associated quality metrics.


Subject(s)
CYS2-HIS2 Zinc Fingers , Repressor Proteins/genetics , Retroelements , Base Sequence , Binding Sites , Chromatin Immunoprecipitation , Gene Expression , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Multigene Family , Protein Binding , Protein Isoforms/genetics , Protein Isoforms/metabolism , Repressor Proteins/metabolism
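The abstract above reports an AUC for how well a motif separates peaks from non-peaks. As a rough illustration only, the sketch below computes a rank-based AUC from two lists of motif scores. The function name and the example scores are hypothetical, and the abstract does not describe the scoring procedure the authors actually used.

```python
def auc(peak_scores, nonpeak_scores):
    """Rank-based AUC: the probability that a randomly chosen peak
    scores higher than a randomly chosen non-peak (ties count as 0.5)."""
    wins = 0.0
    for p in peak_scores:
        for n in nonpeak_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(peak_scores) * len(nonpeak_scores))


# Hypothetical motif scores for three peaks and three non-peak regions.
peaks = [0.9, 0.8, 0.4]
nonpeaks = [0.5, 0.3, 0.2]
print(round(auc(peaks, nonpeaks), 3))
```

An AUC of 0.5 means the motif does no better than chance, so a threshold such as the abstract's 0.6 marks modest but non-random enrichment.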
9.
Bioinformatics ; 32(22): 3504-3506, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27466627

ABSTRACT

Measuring motif similarity is essential for identifying functionally related transcription factors (TFs) and RNA-binding proteins, and for annotating de novo motifs. Here, we describe Motif Similarity Based on Affinity of Targets (MoSBAT), an approach for measuring the similarity of motifs by computing their affinity profiles across a large number of random sequences. We show that MoSBAT successfully associates de novo ChIP-seq motifs with their respective TFs, accurately identifies motifs that are obtained from the same TF in different in vitro assays, and quantitatively reflects the similarity of in vitro binding preferences for pairs of TFs. AVAILABILITY AND IMPLEMENTATION: MoSBAT is available as a webserver at mosbat.ccbr.utoronto.ca, and for download at github.com/csglab/MoSBAT. CONTACT: t.hughes@utoronto.ca or hamed.najafabadi@mcgill.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA-Binding Proteins/genetics , Sequence Analysis, Protein/methods , Transcription Factors/genetics , Binding Sites , Protein Binding , Sequence Alignment
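The idea in the MoSBAT abstract above, comparing two motifs by their affinity profiles over many random sequences, can be sketched in a few lines. This is a simplified illustration, not the MoSBAT implementation. It assumes motifs are position weight matrices stored as lists of per-base score dictionaries, takes the best-scoring window on the forward strand as a stand-in for affinity, and uses Pearson correlation as the similarity measure. All function names and toy motifs are hypothetical.

```python
import math
import random

BASES = "ACGT"


def consensus_pwm(consensus, match=1.0, mismatch=-1.0):
    """Toy PWM: one {base: score} dict per position of a consensus string."""
    return [{b: (match if b == c else mismatch) for b in BASES} for c in consensus]


def best_site_score(pwm, seq):
    """Highest PWM score over all windows of seq (forward strand only)."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def random_sequences(n, length, seed=0):
    """n uniformly random DNA sequences, reproducible through the seed."""
    rng = random.Random(seed)
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(n)]


def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def motif_similarity(pwm_a, pwm_b, n=500, length=50, seed=0):
    """Correlate the two motifs' score profiles over the same random sequences."""
    seqs = random_sequences(n, length, seed)
    profile_a = [best_site_score(pwm_a, s) for s in seqs]
    profile_b = [best_site_score(pwm_b, s) for s in seqs]
    return pearson(profile_a, profile_b)


gata = consensus_pwm("GATAAG")
print(motif_similarity(gata, gata))                      # identical motifs
print(motif_similarity(gata, consensus_pwm("CCGCGC")))   # unrelated motifs
```

Because both motifs are scored on the same sequences, the comparison does not require aligning the matrices position by position, which is the practical appeal of a profile-based measure.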
10.
Bioinformatics ; 31(17): 2879-81, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-25953800

ABSTRACT

Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify non-targeted transcription factor (TF) motifs, and are even further limited when peak sequences are similar due to common ancestry rather than common binding factors. The latter aspect particularly affects a large number of proteins from the Cys2His2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail. AVAILABILITY AND IMPLEMENTATION: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: t.hughes@utoronto.ca.


Subject(s)
Carrier Proteins/metabolism , Chromatin Immunoprecipitation/methods , High-Throughput Nucleotide Sequencing/methods , Nuclear Proteins/metabolism , Nucleotide Motifs/genetics , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism , Zinc Fingers/genetics , Algorithms , Binding Sites , Gene Expression Regulation , Genome, Human , Humans , Repressor Proteins , Retroelements/genetics , Sequence Analysis, DNA/methods
11.
Elife ; 4, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25905672

ABSTRACT

Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs.


Subject(s)
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , DNA, Helminth/genetics , Transcription Factors/genetics , Zinc Fingers/genetics , Amino Acid Sequence , Animals , Base Sequence , Binding Sites , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/chemistry , Caenorhabditis elegans Proteins/metabolism , DNA, Helminth/chemistry , DNA, Helminth/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Molecular Sequence Data , Promoter Regions, Genetic , Protein Binding , Protein Interaction Domains and Motifs , Receptors, Cytoplasmic and Nuclear , Transcription Factors/chemistry , Transcription Factors/metabolism
12.
Nat Biotechnol ; 33(5): 555-62, 2015 May.
Article in English | MEDLINE | ID: mdl-25690854

ABSTRACT

Cys2-His2 zinc finger (C2H2-ZF) proteins represent the largest class of putative human transcription factors. However, for most C2H2-ZF proteins it is unknown whether they even bind DNA or, if they do, to which sequences. Here, by combining data from a modified bacterial one-hybrid system with protein-binding microarray and chromatin immunoprecipitation analyses, we show that natural C2H2-ZFs encoded in the human genome bind DNA both in vitro and in vivo, and we infer the DNA recognition code using DNA-binding data for thousands of natural C2H2-ZF domains. In vivo binding data are generally consistent with our recognition code and indicate that C2H2-ZF proteins recognize more motifs than all other human transcription factors combined. We provide direct evidence that most KRAB-containing C2H2-ZF proteins bind specific endogenous retroelements (EREs), ranging from currently active to ancient families. The majority of C2H2-ZF proteins, including KRAB proteins, also show widespread binding to regulatory regions, indicating that the human genome contains an extensive and largely unstudied adaptive C2H2-ZF regulatory network that targets a diverse range of genes and pathways.


Subject(s)
Carrier Proteins/metabolism , Genome, Human , Nuclear Proteins/metabolism , Repressor Proteins/metabolism , Retroelements/genetics , Carrier Proteins/genetics , Chromatin/metabolism , DNA-Binding Proteins/genetics , Gene Expression Regulation , Humans , Nuclear Proteins/genetics , Protein Binding , Regulatory Sequences, Nucleic Acid , Repressor Proteins/genetics
13.
Cell ; 158(6): 1431-1443, 2014 Sep 11.
Article in English | MEDLINE | ID: mdl-25215497

ABSTRACT

Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.


Subject(s)
Arabidopsis/genetics , Nucleotide Motifs , Sequence Analysis, DNA , Transcription Factors/metabolism , Arabidopsis/metabolism , Chromatin Immunoprecipitation , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Protein Binding , Quantitative Trait Loci
14.
Nature ; 499(7457): 172-7, 2013 Jul 11.
Article in English | MEDLINE | ID: mdl-23846655

ABSTRACT

RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.


Subject(s)
Gene Expression Regulation/genetics , Nucleotide Motifs/genetics , RNA-Binding Proteins/metabolism , Autistic Disorder/genetics , Base Sequence , Binding Sites/genetics , Conserved Sequence/genetics , Eukaryotic Cells/metabolism , Humans , Molecular Sequence Data , Protein Structure, Tertiary/genetics , RNA Splicing Factors , RNA Stability/genetics , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics
15.
Mol Ecol Resour ; 11(1): 84-8, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21429103

ABSTRACT

DNA barcoding is based on the use of short DNA sequences to provide taxonomic tags for rapid, efficient identification of biological specimens. Currently, reference databases are being compiled. In the future, it will be important to facilitate access to these databases, especially for nonspecialist users. The method described here provides a rapid, web-based, user-friendly link between the DNA sequence from an unidentified biological specimen and various types of biological information, including the species name. Specifically, we use a customized, Google-type search algorithm to quickly match an unknown DNA sequence to a list of verified DNA barcodes in the reference database. In addition to retrieving the species name, our web tool also provides automatic links to a range of other information about that species. As the DNA barcode database becomes more populated, it will become increasingly important for the broader user community to be able to exploit it for the rapid identification of unknown specimens and to easily obtain relevant biological information about these species. The application presented here meets that need.


Subject(s)
DNA Barcoding, Taxonomic , Sequence Analysis, DNA/methods , Algorithms , DNA/genetics , Databases, Genetic , Internet , Molecular Sequence Data , Sequence Analysis, DNA/instrumentation , Software
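The abstract above describes matching an unknown sequence against verified barcodes with a "Google-type" search but gives no algorithmic detail. One plausible reading, offered purely as an assumption, is an inverted index over short subsequences (k-mers). The sketch below builds such an index and returns the reference species sharing the most k-mers with the query. The function names, the value of k, and the barcode strings are all made up for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(reference, k=8):
    """Inverted index: k-mer -> set of species whose barcode contains it."""
    index = defaultdict(set)
    for species, barcode in reference.items():
        for km in kmers(barcode, k):
            index[km].add(species)
    return index


def identify(query, index, k=8):
    """Return (species, shared k-mer count) for the best hit, or None."""
    hits = Counter()
    for km in kmers(query, k):
        for species in index.get(km, ()):
            hits[species] += 1
    return hits.most_common(1)[0] if hits else None


# Hypothetical reference barcodes, not real sequences.
reference = {
    "species_a": "ACGTACGGTTCAGGCTAACGTTAGC",
    "species_b": "TTGACCGATAGGCTTACCGATGGAA",
}
index = build_index(reference)
print(identify("GTACGGTTCAGGCTAACGTT", index))
```

An index of this kind keeps lookup cost tied to the query length rather than the database size, which matters as the reference collection grows.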
16.
Genome Biol Evol ; 1: 288-93, 2009 Aug 04.
Article in English | MEDLINE | ID: mdl-20333198

ABSTRACT

The availability of complete genome sequences for 12 Drosophila species provides an unprecedented resource for large-scale studies of genome evolution. In this study, we looked for correlated shifts in the patterns of genome and proteome evolution within the genus Drosophila. Specifically, we asked if the nucleotide composition of the Drosophila willistoni genome (which is significantly less GC rich than the other 11 sequenced Drosophila genomes) is reflected in an altered pattern of amino acid substitutions in the encoded proteins. Our results show that this is indeed the case: there are large and highly significant asymmetries in the patterns of amino acid substitution between D. willistoni and Drosophila melanogaster, and they are in the direction predicted by the nucleotide biases. The implication of this result, combined with previous studies on long-term proteome evolution, is that substitutional biases at the DNA level can be a major factor in determining both the long-term and the short-term directions of proteome evolution.

17.
Mol Biol Evol ; 25(12): 2521-4, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18842686

ABSTRACT

The relative rates of nucleotide substitution at synonymous and nonsynonymous sites within protein-coding regions have been widely used to infer the action of natural selection from comparative sequence data. It is known, however, that mutational and repair biases can affect rates of evolution at both synonymous and nonsynonymous sites. More importantly, it is also known that synonymous sites are particularly prone to the effects of nucleotide bias. This means that nucleotide biases may affect the calculated ratio of substitution rates at synonymous and nonsynonymous sites. Using a large data set of animal mitochondrial sequences, we demonstrate that this is, in fact, the case. Highly biased nucleotide sequences are characterized by significantly elevated dN/dS ratios, but only when the nucleotide frequencies are not taken into account. When the analysis is repeated taking the nucleotide frequencies at each codon position into account, such elevated ratios disappear. These results suggest that the recently reported differences in dN/dS ratios between vertebrate and invertebrate mitochondrial sequences could be explained by variations in mitochondrial nucleotide frequencies rather than the effects of positive Darwinian selection.


Subject(s)
DNA, Mitochondrial/genetics , Evolution, Molecular , Nucleotides/genetics , Selection, Genetic , Animals , Base Sequence