Results 1 - 20 of 3,971
1.
Nat Commun ; 15(1): 3699, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698035

ABSTRACT

In silico identification of viral anti-CRISPR proteins (Acrs) has relied largely on the guilt-by-association method using known Acrs or anti-CRISPR associated proteins (Acas) as the bait. However, the low number and limited spread of the characterized archaeal Acrs and Aca hinder our ability to identify Acrs using guilt-by-association. Here, based on the observation that the few characterized archaeal Acrs and Aca are transcribed immediately post viral infection, we hypothesize that these genes, and many other unidentified anti-defense genes (ADG), are under the control of conserved regulatory sequences including a strong promoter, which can be used to predict anti-defense genes in archaeal viruses. Using this consensus-sequence-based method, we identify 354 potential ADGs in 57 archaeal viruses and 6 metagenome-assembled genomes. Experimental validation identified a CRISPR subtype I-A inhibitor and the first virally encoded inhibitor of an archaeal toxin-antitoxin based immune system. We also identify regulatory proteins potentially akin to Acas that can facilitate further identification of ADGs combined with the guilt-by-association approach. These results demonstrate the potential of regulatory sequence analysis for extensive identification of ADGs in viruses of archaea and bacteria.
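
The idea of predicting genes from a shared upstream consensus sequence can be illustrated with a minimal sketch. It is not the authors' pipeline: the consensus string, the gene names, and the mismatch tolerance below are hypothetical placeholders, and a real analysis would likely use position weight matrices rather than a single IUPAC string.

```python
# Illustrative only: scan upstream regions for a consensus motif written
# with IUPAC codes. The motif and gene names used here are hypothetical.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def mismatches(window, consensus):
    """Count positions where the window violates the consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))

def find_motif(sequence, consensus, max_mismatches=0):
    """Return start offsets where the consensus matches within the tolerance."""
    sequence = sequence.upper()
    width = len(consensus)
    return [
        i for i in range(len(sequence) - width + 1)
        if mismatches(sequence[i:i + width], consensus) <= max_mismatches
    ]

def candidate_genes(upstream_regions, consensus, max_mismatches=0):
    """Keep genes whose upstream region contains at least one motif hit."""
    return {
        gene: hits
        for gene, seq in upstream_regions.items()
        if (hits := find_motif(seq, consensus, max_mismatches))
    }

# Hypothetical input: two upstream regions and a made-up consensus.
upstream = {"gene_a": "GGCTTTAAAGGC", "gene_b": "GGCCGGCCGGCC"}
hits = candidate_genes(upstream, "TTTAWA")
```

Here only the first region carries the motif, so only that gene would be kept as a candidate for follow-up.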


Subject(s)
Archaea , Archaeal Viruses , Archaeal Viruses/genetics , Archaea/genetics , Archaea/virology , Archaea/immunology , Promoter Regions, Genetic/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Regulatory Sequences, Nucleic Acid/genetics , Viral Proteins/genetics , Archaeal Proteins/genetics , Archaeal Proteins/metabolism , Metagenome/genetics , CRISPR-Associated Proteins/genetics , CRISPR-Associated Proteins/metabolism , CRISPR-Cas Systems/genetics
2.
Sci Rep ; 14(1): 10078, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698030

ABSTRACT

Comparative analyses between traditional model organisms, such as the fruit fly Drosophila melanogaster, and more recent model organisms, such as the red flour beetle Tribolium castaneum, have provided a wealth of insight into conserved and diverged aspects of gene regulation. While the study of trans-regulatory components is relatively straightforward, the study of cis-regulatory elements (CREs, or enhancers) remains challenging outside of Drosophila. A central component of this challenge has been finding a core promoter suitable for enhancer-reporter assays in diverse insect species. Previously, we demonstrated that a Drosophila Synthetic Core Promoter (DSCP) functions in a cross-species manner in Drosophila and Tribolium. Given the over 300 million years of divergence between the Diptera and Coleoptera, we reasoned that DSCP-based reporter constructs will be useful when studying cis-regulation in a variety of insect models across the holometabola and possibly beyond. To this end, we sought to create a suite of new DSCP-based reporter vectors, leveraging dual compatibility with piggyBac and PhiC31-integration, the 3xP3 universal eye marker, GATEWAY cloning, different colors of reporters and markers, as well as Gal4-UAS binary expression. While all constructs functioned properly with a Tc-nub enhancer in Drosophila, complications arose with tissue-specific Gal4-UAS binary expression in Tribolium. Nevertheless, the functionality of these constructs across multiple holometabolous orders suggests a high potential compatibility with a variety of other insects. In addition, we present the piggyLANDR (piggyBac-LoxP AttP Neutralizable Destination Reporter) platform for the establishment of proper PhiC31 landing sites free from position effects. As a proof-of-principle, we demonstrated the workflow for piggyLANDR in Drosophila. 
The potential utility of these tools ranges from molecular biology research to pest and disease-vector management, and will help advance the study of gene regulation beyond traditional insect models.


Subject(s)
Drosophila melanogaster , Genes, Reporter , Genetic Vectors , Promoter Regions, Genetic , Tribolium , Animals , Genetic Vectors/genetics , Tribolium/genetics , Drosophila melanogaster/genetics , Enhancer Elements, Genetic , Regulatory Sequences, Nucleic Acid/genetics , Insecta/genetics , Animals, Genetically Modified
3.
Mol Biol Rep ; 51(1): 612, 2024 May 05.
Article in English | MEDLINE | ID: mdl-38704770

ABSTRACT

BACKGROUND: The α-Major Regulatory Element (α-MRE), also known as HS-40, is located upstream of the α-globin gene cluster and has a crucial role in the long-range regulation of the α-globin gene expression. This enhancer is polymorphic and several haplotypes were identified in different populations, with haplotype D almost exclusively found in African populations. The purpose of this research was to identify the HS-40 haplotype associated with the 3.7 kb α-thalassemia deletion (-α3.7del) in the Portuguese population, and determine its ancestry and influence on patients' hematological phenotype. METHODS AND RESULTS: We selected 111 Portuguese individuals previously analyzed by Gap-PCR to detect the presence of the -α3.7del: 50 without the -α3.7del, 34 heterozygous and 27 homozygous for the -α3.7del. The HS-40 region was amplified by PCR followed by Sanger sequencing. Four HS-40 haplotypes were found (A to D). The distributions of HS-40 haplotypes and genotypes are significantly different between individuals with and without the -α3.7del, with haplotype D and genotype AD being the most prevalent in patients homozygous for this deletion. Furthermore, multiple correspondence analysis revealed that individuals without the -α3.7del are grouped with other European populations, while samples with the -α3.7del are separated from these and found more closely related to the African population. CONCLUSION: This study revealed for the first time an association of the HS-40 haplotype D with the -α3.7del in the Portuguese population, and its likely African ancestry. These results may have clinical importance as in vitro analysis of haplotype D showed a decrease in its enhancer activity on α-globin gene.


Subject(s)
Haplotypes , Sequence Deletion , alpha-Globins , alpha-Thalassemia , Female , Humans , Male , alpha-Globins/genetics , alpha-Thalassemia/genetics , Black People/genetics , Gene Frequency/genetics , Genotype , Haplotypes/genetics , Portugal , Regulatory Sequences, Nucleic Acid/genetics , Sequence Deletion/genetics
5.
Sci Adv ; 10(21): eadj4452, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781344

ABSTRACT

Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.


Subject(s)
Brain , Epigenesis, Genetic , Regulatory Sequences, Nucleic Acid , Humans , Brain/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Animals , Evolution, Molecular , Mental Disorders/genetics , Regulatory Elements, Transcriptional/genetics , Neurons/metabolism , Gene Expression Regulation , Transcription Factors/genetics , Transcription Factors/metabolism
6.
Biosci Rep ; 44(5), 2024 May 29.
Article in English | MEDLINE | ID: mdl-38743016

ABSTRACT

Varicose vein disease (VVD) is a common health problem worldwide. Microfibril-associated protein 5 (MFAP5) is one of the potential key players in its pathogenesis. Our previous microarray analysis revealed the cg06256735 and cg15815843 loci in the regulatory regions of the MFAP5 gene as hypomethylated in varicose veins which correlated with its up-regulation. The aim of this work was to validate preliminary microarray data, estimate the level of 5-hydroxymethylcytosine (5hmC) at these loci, and determine the methylation status of one of them in different layers of the venous wall. For this, methyl- and hydroxymethyl-sensitive restriction techniques were used followed by real-time PCR and droplet digital PCR, respectively, as well as bisulfite pyrosequencing of +/- oxidized DNA. Our microarray data on hypomethylation at the cg06256735 and cg15815843 loci in whole varicose vein segments were confirmed and it was also demonstrated that the level of 5hmC at these loci is increased in VVD. Specifically, among other layers of the venous wall, tunica (t.) intima is the main contributor to hypomethylation at the cg06256735 locus in varicose veins. Thus, it was shown that hypomethylation at the cg06256735 and cg15815843 loci takes place in VVD, with evidence to suggest that it happens through their active demethylation leading to up-regulation of the MFAP5 gene, and t. intima is most involved in this biochemical process.


Subject(s)
5-Methylcytosine , DNA Methylation , Varicose Veins , Varicose Veins/genetics , Varicose Veins/metabolism , Humans , Male , Female , Middle Aged , 5-Methylcytosine/analogs & derivatives , 5-Methylcytosine/metabolism , Adult , Aged , Regulatory Sequences, Nucleic Acid/genetics , Genetic Loci
7.
Nat Commun ; 15(1): 3839, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714659

ABSTRACT

Pre-mRNA splicing, a key process in gene expression, can be therapeutically modulated using various drug modalities, including antisense oligonucleotides (ASOs). However, determining promising targets is hampered by the challenge of systematically mapping splicing-regulatory elements (SREs) in their native sequence context. Here, we use the catalytically inactive CRISPR-RfxCas13d RNA-targeting system (dCas13d/gRNA) as a programmable platform to bind SREs and modulate splicing by competing against endogenous splicing factors. SpliceRUSH, a high-throughput screening method, was developed to map SREs in any gene of interest using a lentivirus gRNA library that tiles the genetic region, including distal intronic sequences. When applied to SMN2, a therapeutic target for spinal muscular atrophy, SpliceRUSH robustly identifies not only known SREs but also a previously unknown distal intronic SRE, which can be targeted to alter exon 7 splicing using either dCas13d/gRNA or ASOs. This technology enables a deeper understanding of splicing regulation with applications for RNA-based drug discovery.


Subject(s)
CRISPR-Cas Systems , Exons , Introns , RNA Splicing , RNA, Guide, CRISPR-Cas Systems , Survival of Motor Neuron 2 Protein , Humans , RNA Splicing/genetics , Survival of Motor Neuron 2 Protein/genetics , RNA, Guide, CRISPR-Cas Systems/genetics , Introns/genetics , Exons/genetics , HEK293 Cells , Oligonucleotides, Antisense/genetics , Muscular Atrophy, Spinal/genetics , Regulatory Sequences, Nucleic Acid/genetics , RNA Precursors/genetics , RNA Precursors/metabolism
8.
BMC Bioinformatics ; 25(1): 179, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714913

ABSTRACT

BACKGROUND: As genomic studies continue to implicate non-coding sequences in disease, testing the roles of these variants requires insights into the cell type(s) in which they are likely to be mediating their effects. Prior methods for associating non-coding variants with cell types have involved approaches using linkage disequilibrium or ontological associations, incurring significant processing requirements. GaiaAssociation is a freely available, open-source software that enables thousands of genomic loci implicated in a phenotype to be tested for enrichment at regulatory loci of multiple cell types in minutes, permitting insights into the cell type(s) mediating the studied phenotype. RESULTS: In this work, we present Regulatory Landscape Enrichment Analysis (RLEA) by GaiaAssociation and demonstrate its capability to test the enrichment of 12,133 variants across the cis-regulatory regions of 44 cell types. This analysis was completed in 134.0 ± 2.3 s, highlighting the efficient processing provided by GaiaAssociation. The intuitive interface requires only four inputs, offers a collection of customizable functions, and visualizes variant enrichment in cell-type regulatory regions through a heatmap matrix. GaiaAssociation is available on PyPi for download as a command line tool or Python package and the source code can also be installed from GitHub at https://github.com/GreallyLab/gaiaAssociation . CONCLUSIONS: GaiaAssociation is a novel package that provides an intuitive and efficient resource to understand the enrichment of non-coding variants across the cis-regulatory regions of different cells, empowering studies seeking to identify disease-mediating cell types.
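
To make the notion of testing variants for enrichment at regulatory loci concrete, the following is a minimal sketch using a one-sided hypergeometric test. It does not reproduce GaiaAssociation's algorithm or interface, neither of which is described in the abstract. All function names, coordinates, and the choice of test are assumptions for illustration, and positions are treated as lying on a single chromosome.

```python
from math import comb

def overlaps(position, regions):
    """True if a position falls inside any inclusive (start, end) interval."""
    return any(start <= position <= end for start, end in regions)

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n loci from N, of which K overlap regions."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def enrichment_p(variants, background, regions):
    """Return (overlap count, p-value) for variants versus background loci."""
    k = sum(overlaps(p, regions) for p in variants)
    K = k + sum(overlaps(p, regions) for p in background)
    N = len(variants) + len(background)
    return k, hypergeom_tail(k, len(variants), K, N)

# Hypothetical data: one regulatory region, four variants, six background loci.
regions = [(100, 200)]
variants = [110, 150, 190, 500]
background = [10, 20, 30, 120, 600, 700]
k, p_value = enrichment_p(variants, background, regions)
```

Repeating such a test for each cell type's regulatory regions would give one p-value per cell type, which is the kind of matrix a heatmap could display.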


Subject(s)
Software , Genetic Variation , Humans , Genomics/methods , Computational Biology/methods , Phenotype , Regulatory Sequences, Nucleic Acid/genetics , Linkage Disequilibrium
9.
Nature ; 629(8010): 127-135, 2024 May.
Article in English | MEDLINE | ID: mdl-38658750

ABSTRACT

Phenotypic variation among species is a product of evolutionary changes to developmental programs. However, how these changes generate novel morphological traits remains largely unclear. Here we studied the genomic and developmental basis of the mammalian gliding membrane, or patagium, an adaptive trait that has repeatedly evolved in different lineages, including in closely related marsupial species. Through comparative genomic analysis of 15 marsupial genomes, both from gliding and non-gliding species, we find that the Emx2 locus experienced lineage-specific patterns of accelerated cis-regulatory evolution in gliding species. By combining epigenomics, transcriptomics and in-pouch marsupial transgenics, we show that Emx2 is a critical upstream regulator of patagium development. Moreover, we identify different cis-regulatory elements that may be responsible for driving increased Emx2 expression levels in gliding species. Lastly, using mouse functional experiments, we find evidence that Emx2 expression patterns in gliders may have been modified from a pre-existing program found in all mammals. Together, our results suggest that patagia repeatedly originated through a process of convergent genomic evolution, whereby regulation of Emx2 was altered by distinct cis-regulatory elements in independently evolved species. Thus, different regulatory elements targeting the same key developmental gene may constitute an effective strategy by which natural selection has harnessed regulatory evolution in marsupial genomes to generate phenotypic novelty.


Subject(s)
Evolution, Molecular , Homeodomain Proteins , Locomotion , Marsupialia , Transcription Factors , Animals , Female , Male , Mice , Epigenomics , Gene Expression Profiling , Gene Expression Regulation, Developmental , Genome/genetics , Genomics , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Locomotion/genetics , Marsupialia/anatomy & histology , Marsupialia/classification , Marsupialia/genetics , Marsupialia/growth & development , Phylogeny , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism , Transcription Factors/genetics , Phenotype , Humans
10.
Nat Commun ; 15(1): 3488, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664394

ABSTRACT

Elucidating the relationship between non-coding regulatory element sequences and gene expression is crucial for understanding gene regulation and genetic variation. We explored this link with the training of interpretable deep learning models predicting gene expression profiles from gene flanking regions of the plant species Arabidopsis thaliana, Solanum lycopersicum, Sorghum bicolor, and Zea mays. With over 80% accuracy, our models enabled predictive feature selection, highlighting e.g. the significant role of UTR regions in determining gene expression levels. The models demonstrated remarkable cross-species performance, effectively identifying both conserved and species-specific regulatory sequence features and their predictive power for gene expression. We illustrated the application of our approach by revealing causal links between genetic variation and gene expression changes across fourteen tomato genomes. Lastly, our models efficiently predicted genotype-specific expression of key functional gene groups, exemplified by underscoring known phenotypic and metabolic differences between Solanum lycopersicum and its wild, drought-resistant relative, Solanum pennellii.


Subject(s)
Arabidopsis , Deep Learning , Gene Expression Regulation, Plant , Solanum lycopersicum , Sorghum , Zea mays , Solanum lycopersicum/genetics , Solanum lycopersicum/metabolism , Sorghum/genetics , Sorghum/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Zea mays/genetics , Regulatory Sequences, Nucleic Acid/genetics , Genome, Plant , Genetic Variation , Species Specificity
11.
Transfusion ; 64(6): 1083-1096, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38644556

ABSTRACT

BACKGROUND: Blood typing is essential for safe transfusions and is performed serologically or genetically. Genotyping predominantly focuses on coding regions, but non-coding variants may affect gene regulation, as demonstrated in the ABO, FY and XG systems. To uncover regulatory loci, we expanded a recently developed bioinformatics pipeline for discovery of non-coding variants by including additional epigenetic datasets. METHODS: Multiple datasets including ChIP-seq with erythroid transcription factors (TFs), histone modifications (H3K27ac, H3K4me1), and chromatin accessibility (ATAC-seq) were analyzed. Candidate regulatory regions were investigated for activity (luciferase assays) and TF binding (electrophoretic mobility shift assay, EMSA, and mass spectrometry, MS). RESULTS: In total, 814 potential regulatory sites in 47 blood-group-related genes were identified where one or more erythroid TFs bound. Enhancer candidates in CR1, EMP3, ABCB6, and ABCC4 indicated by ATAC-seq, histone markers, and co-occupancy of 4 TFs (GATA1/KLF1/RUNX1/NFE2) were investigated but only CR1 and ABCC4 showed increased transcription. Co-occupancy of GATA1 and KLF1 was observed in the KEL promoter, previously reported to contain GATA1 and Sp1 sites. TF binding energy scores decreased when three naturally occurring variants were introduced into GATA1 and KLF1 motifs. Two of three GATA1 sites and the KLF1 site were confirmed functionally. EMSA and MS demonstrated increased GATA1 and KLF1 binding to the wild-type compared to variant motifs. DISCUSSION: This combined bioinformatics and experimental approach revealed multiple candidate regulatory regions and predicted TF co-occupancy sites. The KEL promoter was characterized in detail, indicating that two adjacent GATA1 and KLF1 motifs are most crucial for transcription.


Subject(s)
Blood Group Antigens , Epigenesis, Genetic , Humans , Blood Group Antigens/genetics , GATA1 Transcription Factor/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Kruppel-Like Transcription Factors/genetics
12.
Cell Genom ; 4(4): 100536, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38604126

ABSTRACT

Gene regulatory divergence between species can result from cis-acting local changes to regulatory element DNA sequences or global trans-acting changes to the regulatory environment. Understanding how these mechanisms drive regulatory evolution has been limited by challenges in identifying trans-acting changes. We present a comprehensive approach to directly identify cis- and trans-divergent regulatory elements between human and rhesus macaque lymphoblastoid cells using assay for transposase-accessible chromatin coupled to self-transcribing active regulatory region (ATAC-STARR) sequencing. In addition to thousands of cis changes, we discover an unexpected number (∼10,000) of trans changes and show that cis and trans elements exhibit distinct patterns of sequence divergence and function. We further identify differentially expressed transcription factors that underlie ∼37% of trans differences and trace how cis changes can produce cascades of trans changes. Overall, we find that most divergent elements (67%) experienced changes in both cis and trans, revealing a substantial role for trans divergence, alone and together with cis changes, in regulatory differences between species.


Subject(s)
Gene Expression Regulation , Regulatory Sequences, Nucleic Acid , Animals , Humans , Macaca mulatta/genetics , Regulatory Sequences, Nucleic Acid/genetics , Gene Expression Regulation/genetics , Transcription Factors/genetics , Chromatin/genetics
13.
Cell Rep ; 43(4): 113983, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38517895

ABSTRACT

Transcriptional silencing in Saccharomyces cerevisiae involves the generation of a chromatin state that stably represses transcription. Using multiple reporter assays, a diverse set of upstream activating sequence enhancers and core promoters were investigated for their susceptibility to silencing. We show that heterochromatin stably silences only weak and stress-induced regulatory elements but is unable to stably repress housekeeping gene regulatory elements, and the partial repression of these elements did not result in bistable expression states. Permutation analysis of enhancers and promoters indicates that both elements are targets of repression. Chromatin remodelers help specific regulatory elements to resist repression, most probably by altering nucleosome mobility and changing transcription burst duration. The strong enhancers/promoters can be repressed if silencer-bound Sir1 is increased. Together, our data suggest that the heterochromatic locus has been optimized to stably silence the weak mating-type gene regulatory elements but not strong housekeeping gene regulatory sequences.


Subject(s)
Gene Expression Regulation, Fungal , Gene Silencing , Heterochromatin , Promoter Regions, Genetic , Saccharomyces cerevisiae , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Heterochromatin/metabolism , Heterochromatin/genetics , Promoter Regions, Genetic/genetics , Enhancer Elements, Genetic/genetics , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae Proteins/genetics , Regulatory Sequences, Nucleic Acid/genetics , Nucleosomes/metabolism , Nucleosomes/genetics
14.
Sci Rep ; 14(1): 7370, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38548819

ABSTRACT

Class switch recombination (CSR) plays an important role in adaptive immune response by enabling mature B cells to replace the initial IgM by another antibody class (IgG, IgE or IgA). CSR is preceded by transcription of the IgH constant genes and is controlled by the super-enhancer 3' regulatory region (3'RR) in an activation-specific manner. The 3'RR is composed of four enhancers (hs3a, hs1-2, hs3b and hs4). In mature B cells, 3'RR activity correlates with transcription of its enhancers. CSR can also occur in primary developing B cells though at low frequency, but in contrast to mature B cells, the transcriptional elements that regulate the process in developing B cells are poorly understood. In particular, the role of the 3'RR in the control of constant genes' transcription and CSR has not been addressed. Here, by using a mouse line devoid of the 3'RR and a culture system that highly enriches in pro-B cells, we show that the 3'RR activity is indeed required for switch transcription and CSR, though its effect varies in an isotype-specific manner and correlates with transcription of hs4 enhancer only.


Subject(s)
Immunoglobulin Heavy Chains , Super Enhancers , Immunoglobulin Heavy Chains/genetics , Regulatory Sequences, Nucleic Acid/genetics , Immunoglobulin Class Switching/genetics , B-Lymphocytes , Immunoglobulin Isotypes/genetics , Enhancer Elements, Genetic
15.
PLoS Genet ; 20(3): e1011174, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38437180

ABSTRACT

A striking paradox is that genes with conserved protein sequence, function and expression pattern over deep time often exhibit extremely divergent cis-regulatory sequences. It remains unclear how such drastic cis-regulatory evolution across species allows preservation of gene function, and to what extent these differences influence how cis-regulatory variation arising within species impacts phenotypic change. Here, we investigated these questions using a plant stem cell regulator conserved in expression pattern and function over ~125 million years. Using in-vivo genome editing in two distantly related models, Arabidopsis thaliana (Arabidopsis) and Solanum lycopersicum (tomato), we generated over 70 deletion alleles in the upstream and downstream regions of the stem cell repressor gene CLAVATA3 (CLV3) and compared their individual and combined effects on a shared phenotype, the number of carpels that make fruits. We found that sequences upstream of tomato CLV3 are highly sensitive to even small perturbations compared to its downstream region. In contrast, Arabidopsis CLV3 function is tolerant to severe disruptions both upstream and downstream of the coding sequence. Combining upstream and downstream deletions also revealed a different regulatory outcome. Whereas phenotypic enhancement from adding downstream mutations was predominantly weak and additive in tomato, mutating both regions of Arabidopsis CLV3 caused substantial and synergistic effects, demonstrating distinct distribution and redundancy of functional cis-regulatory sequences. Our results demonstrate remarkable malleability in cis-regulatory structural organization of a deeply conserved plant stem cell regulator and suggest that major reconfiguration of cis-regulatory sequence space is a common yet cryptic evolutionary force altering genotype-to-phenotype relationships from regulatory variation in conserved genes. 
Finally, our findings underscore the need for lineage-specific dissection of the spatial architecture of cis-regulation to effectively engineer trait variation from conserved productivity genes in crops.


Subject(s)
Arabidopsis , Arabidopsis/genetics , Regulatory Sequences, Nucleic Acid/genetics , Crops, Agricultural , Alleles , Amino Acid Sequence
16.
Nat Commun ; 15(1): 1600, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383453

ABSTRACT

Cross-species genome comparisons have revealed a substantial number of ultraconserved non-coding elements (UCNEs). Several of these elements have proved to be essential tissue- and cell type-specific cis-regulators of developmental gene expression. Here, we characterize a set of UCNEs as candidate CREs (cCREs) during retinal development and evaluate the contribution of their genomic variation to rare eye diseases, for which pathogenic non-coding variants are emerging. Integration of bulk and single-cell retinal multi-omics data reveals 594 genes under potential cis-regulatory control of UCNEs, of which 45 are implicated in rare eye disease. Mining of candidate cis-regulatory UCNEs in WGS data derived from the rare eye disease cohort of Genomics England reveals 178 ultrarare variants within 84 UCNEs associated with 29 disease genes. Overall, we provide a comprehensive annotation of ultraconserved non-coding regions acting as cCREs during retinal development which can be targets of non-coding variation underlying rare eye diseases.


Subject(s)
Eye Diseases , Multiomics , Humans , Retina/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Genome , Eye Diseases/genetics , Eye Diseases/metabolism
17.
Plant Cell ; 36(6): 2272-2288, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38421027

ABSTRACT

A number of cis-regulatory elements (CREs) conserved during evolution have been found to be responsible for phenotypic novelty and variation. Cucurbit crops such as cucumber (Cucumis sativus), watermelon (Citrullus lanatus), melon (Cucumis melo), and squash (Cucurbita maxima) develop fruits from an inferior ovary and share some similar biological processes during fruit development. Whether conserved regulatory sequences play critical roles in fruit development of cucurbit crops remains to be explored. In six well-studied cucurbit species, we identified 392,438 conserved noncoding sequences (CNSs), including 82,756 that are specific to cucurbits, by comparative genomics. Genome-wide profiling of accessible chromatin regions (ACRs) and gene expression patterns mapped 20,865 to 43,204 ACRs and their potential target genes for two fruit tissues at two key developmental stages in six cucurbits. Integrated analysis of CNSs and ACRs revealed 4,431 syntenic orthologous CNSs, including 1,687 cucurbit-specific CNSs that overlap with ACRs that are present in all six cucurbit crops and that may regulate the expression of 757 adjacent orthologous genes. CRISPR mutations targeting two CNSs present in the 1,687 cucurbit-specific sequences resulted in substantially altered fruit shape and gene expression patterns of adjacent NAC1 (NAM, ATAF1/2, and CUC2) and EXT-like (EXTENSIN-like) genes, validating the regulatory roles of these CNSs in fruit development. These results not only provide a number of target CREs for cucurbit crop improvement, but also provide insight into the roles of CREs in plant biology and during evolution.


Subject(s)
Conserved Sequence , Fruit , Gene Expression Regulation, Plant , Fruit/genetics , Fruit/growth & development , Regulatory Sequences, Nucleic Acid/genetics , Crops, Agricultural/genetics , Crops, Agricultural/growth & development , Cucurbita/genetics , Cucurbita/growth & development , Citrullus/genetics , Citrullus/growth & development , Citrullus/metabolism , Cucumis sativus/genetics , Cucumis sativus/growth & development , Plant Proteins/genetics , Plant Proteins/metabolism , Genome, Plant/genetics
18.
Int J Mol Sci ; 25(3), 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38339181

ABSTRACT

The concept of cis-regulatory modules located in gene promoters represents today's vision of the organization of gene transcriptional regulation. Such modules are a combination of two or more single, short DNA motifs. The bioinformatic identification of such modules belongs to so-called NP-hard problems with extreme computational complexity, and therefore, simplifications, assumptions, and heuristics are usually deployed to tackle the problem. In practice, this requires, first, many parameters to be set before the search, and second, it leads to the identification of locally optimal results. Here, a novel method, BestCRM, is presented, aimed at identifying the cis-regulatory elements in gene promoters based on an exhaustive search of all the feasible modules' configurations. All required parameters are automatically estimated using positive and negative datasets. To be computationally efficient, the search is accelerated using a multidimensional hash function, allowing the search to complete in a few hours on a regular laptop (for example, a CPU Intel i7, 3.2 GHz, 32 GB RAM). Tests on an established benchmark and real data show better performance of BestCRM compared to the available methods according to several metrics like specificity, sensitivity, AUC, etc. A great practical advantage of the method is its minimum number of input parameters: apart from positive and negative promoters, only a desired level of module presence in promoters is required.
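
The general idea of enumerating motif combinations and counting them with a hash-based lookup can be sketched in a few lines. This is not BestCRM's algorithm, whose multidimensional hash function is not specified in the abstract. The version below is a toy that assumes exact motif matches, modules of exactly two distinct motifs, and a hypothetical `max_gap` distance cutoff, with a plain dictionary counter standing in for the hash structure.

```python
from collections import Counter
from itertools import combinations

def motif_sites(promoter, motifs):
    """List (position, motif) for every exact motif occurrence."""
    sites = []
    for motif in motifs:
        start = promoter.find(motif)
        while start != -1:
            sites.append((start, motif))
            start = promoter.find(motif, start + 1)
    return sorted(sites)

def modules_in(promoter, motifs, max_gap):
    """Set of distinct-motif pairs whose start positions lie within max_gap."""
    found = set()
    for (pos_a, a), (pos_b, b) in combinations(motif_sites(promoter, motifs), 2):
        if a != b and pos_b - pos_a <= max_gap:
            found.add(tuple(sorted((a, b))))
    return found

def count_modules(promoters, motifs, max_gap):
    """Hash-map count of how many promoters contain each module."""
    counts = Counter()
    for promoter in promoters:
        counts.update(modules_in(promoter, motifs, max_gap))
    return counts

def rank_modules(positive, negative, motifs, max_gap=20):
    """Rank modules by presence in positive minus presence in negative promoters."""
    pos = count_modules(positive, motifs, max_gap)
    neg = count_modules(negative, motifs, max_gap)
    score = {m: pos[m] / len(positive) - neg[m] / len(negative) for m in pos}
    return sorted(score.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical promoters and motifs, for illustration only.
positive = ["AATATACCGGCCAA", "GGCCTTTATAGG"]
negative = ["TATA" + "A" * 25 + "GGCC", "CCCCCCCC"]
ranking = rank_modules(positive, negative, ["TATA", "GGCC"], max_gap=10)
```

The dictionary lookup keeps counting cheap per module, but the number of candidate modules still grows quickly with the number of motifs, which is why the exhaustive search described in the abstract needs a more elaborate indexing scheme.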


Subject(s)
Algorithms , Regulatory Sequences, Nucleic Acid , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid/genetics , Gene Expression Regulation , Computational Biology/methods
20.
Am J Hum Genet ; 111(2): 259-279, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38232730

ABSTRACT

Tauopathies are a group of neurodegenerative diseases defined by abnormal aggregates of tau, a microtubule-associated protein encoded by MAPT. MAPT expression is near absent in neural progenitor cells (NPCs) and increases during differentiation. This temporally dynamic expression pattern suggests that MAPT expression could be controlled by transcription factors and cis-regulatory elements specific to differentiated cell types. Given the relevance of MAPT expression to neurodegeneration pathogenesis, identification of such elements is relevant to understanding disease risk and pathogenesis. Here, we performed chromatin conformation assays (HiC & Capture-C), single-nucleus multiomics (RNA-seq+ATAC-seq), bulk ATAC-seq, and ChIP-seq for H3K27ac and CTCF in NPCs and differentiated neurons to nominate candidate cis-regulatory elements (cCREs). We assayed these cCREs using luciferase assays and CRISPR interference (CRISPRi) experiments to measure their effects on MAPT expression. Finally, we integrated cCRE annotations into an analysis of genetic variation in neurodegeneration-affected individuals and control subjects. We identified both proximal and distal regulatory elements for MAPT and confirmed the regulatory function for several regions, including three regions centromeric to MAPT beyond the H1/H2 haplotype inversion breakpoint. We also found that rare and predicted damaging genetic variation in nominated CREs was nominally depleted in dementia-affected individuals relative to control subjects, consistent with the hypothesis that variants that disrupt MAPT enhancer activity, and thereby reduced MAPT expression, may be protective against neurodegenerative disease. Overall, this study provides compelling evidence for pursuing detailed knowledge of CREs for genes of interest to permit better understanding of disease risk.


Subject(s)
Neurodegenerative Diseases , tau Proteins , Humans , Chromatin/genetics , Haplotypes , Neurodegenerative Diseases/genetics , Neurons , Regulatory Sequences, Nucleic Acid/genetics , tau Proteins/genetics