Results 1 - 10 of 10
1.
bioRxiv ; 2024 May 12.
Article in English | MEDLINE | ID: mdl-38766093

ABSTRACT

Analyzing the factors that make transcriptional activation domains functional remains a crucial yet challenging task owing to the significant diversity of their sequences and their intrinsically disordered nature. Almost all existing methods for predicting activation domains have relied either on traditional machine learning approaches, such as logistic regression, that are unable to capture complex patterns in data, or on plain convolutional neural networks, and they have been limited in their exploration of structural features. However, there is tremendous potential in inspecting the structural properties of activation domains, and an opportunity to investigate complex relationships between features of residues in the sequence. To address these gaps, we have used graph neural networks (GNNs), which represent structural data as nodes and edges and allow nodes to exchange information among themselves. We experimented with two graph formulations, one with residues as nodes and the other with atoms as nodes. A logistic regression model was also developed to analyze feature importance. For all the models, several feature combinations were tested. The residue-level GNN model combining amino acid type, residue position, acidic/basic/aromatic property and secondary structure features performed best, with accuracy, F1 score and AUROC of 97.9%, 71% and 97.1% respectively, outperforming other existing methods in the literature when applied to the dataset we used. Among the other structure-based features analyzed, the amphipathic property of helices also proved to be important for classification. Logistic regression results showed that the most dominant feature that makes a sequence functional is the frequency of different types of amino acids in the sequence. Our results have consistently shown that functional sequences have more acidic and aromatic residues, whereas basic residues are seen more in non-functional sequences.
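
The residue-level graph formulation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `residue_graph`, the `window` edge rule, the grouping of histidine with the basic residues, and the feature layout are all assumptions made for illustration, and the secondary structure feature is left out because it would need an external predictor.

```python
# Illustrative sketch only; residue classes and the edge rule are assumptions.
ACIDIC = set("DE")
BASIC = set("KRH")      # assumption: histidine counted as basic
AROMATIC = set("FWY")


def residue_graph(sequence, window=1):
    """Build a residue-level graph from a one-letter amino acid sequence.

    Returns (nodes, edges):
      nodes - one feature dict per residue (type, relative position, class flags)
      edges - undirected pairs (i, j), i < j, for residues at most
              `window` positions apart in the sequence
    """
    n = len(sequence)
    nodes = []
    for i, aa in enumerate(sequence):
        nodes.append({
            "aa": aa,
            "position": i / (n - 1) if n > 1 else 0.0,  # 0.0 = N-terminus, 1.0 = C-terminus
            "acidic": aa in ACIDIC,
            "basic": aa in BASIC,
            "aromatic": aa in AROMATIC,
        })
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, min(i + window, n - 1) + 1)
    ]
    return nodes, edges


nodes, edges = residue_graph("DFKA")
```

A graph neural network library would then consume these nodes and edges as a feature matrix and an edge list. An atom-level formulation would follow the same pattern with atoms as nodes and bonds as edges.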

2.
iScience ; 24(9): 103017, 2021 Sep 24.
Article in English | MEDLINE | ID: mdl-34522860

ABSTRACT

The mechanisms by which transcriptional activation domains (tADs) initiate eukaryotic gene expression have been an enigma for decades because most tADs lack specificity in sequence, structure, and interactions with targets. Machine learning analysis of data sets of tAD sequences generated in vivo elucidated several functionality rules: the functional tAD sequences should (i) be devoid of or depleted of basic amino acid residues, (ii) be enriched with aromatic and acidic residues, (iii) have aromatic residues localized mostly near the terminus of the sequence, and acidic residues localized more internally within a span of 20-30 amino acids, (iv) have both aromatic and acidic residues preferably spread out in the sequence and not clustered, and (v) not be separated by occasional basic residues. These and other more subtle rules are not absolute, reflecting the absence of a tAD consensus sequence and enormous variability, and are consistent with surfactant-like tAD biochemical properties. The findings are compatible with the paradigm-shifting nucleosome detergent mechanism of gene expression activation, contributing to the development of the liquid-liquid phase separation model and the biochemistry of near-stochastic functional allosteric interactions.
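
Rules (i) to (iii) above are compositional and positional, so they can be summarized with a few simple statistics. The Python sketch below is a hypothetical illustration, not the analysis used in the paper: the function name `tad_composition`, the residue groupings (including histidine as basic), and the choice of "distance to the nearest terminus" as a positional measure are assumptions.

```python
# Illustrative sketch only; groupings and the positional measure are assumptions.
ACIDIC = set("DE")
BASIC = set("KRH")      # assumption: histidine counted as basic
AROMATIC = set("FWY")


def tad_composition(sequence):
    """Summarize a candidate sequence against rules (i)-(iii).

    Returns a dict with the fraction of acidic, aromatic and basic residues,
    plus the mean distance (in residues) of aromatic residues from the
    nearest terminus, or None when the sequence has no aromatic residues.
    """
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must not be empty")

    def fraction(group):
        return sum(aa in group for aa in sequence) / n

    # Distance of each aromatic residue to whichever end of the sequence is closer.
    end_distances = [
        min(i, n - 1 - i) for i, aa in enumerate(sequence) if aa in AROMATIC
    ]
    mean_end_distance = (
        sum(end_distances) / len(end_distances) if end_distances else None
    )
    return {
        "acidic": fraction(ACIDIC),
        "aromatic": fraction(AROMATIC),
        "basic": fraction(BASIC),
        "aromatic_mean_end_distance": mean_end_distance,
    }


stats = tad_composition("DDEFWK")
```

Under the rules listed in the abstract, a functional candidate would be expected to show a low `basic` fraction and comparatively high `acidic` and `aromatic` fractions. Since the abstract stresses that the rules are not absolute, these numbers are better treated as descriptive features than as a classifier.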

3.
Trends Biochem Sci ; 43(12): 951-959, 2018 12.
Article in English | MEDLINE | ID: mdl-30297207

ABSTRACT

The transcriptional activation domains (TADs) are critical for life, yet intrinsically disordered polypeptides with no specific consensus sequence, interacting with multiple targets via low-specificity fuzzy contacts. The recent integration of machine learning approaches in biochemistry allows analysis of large experimental datasets of functional TADs as a whole and clear observation of TAD features. The emerging picture describes TADs as sequences without consensus but with a variety of detergent-like mini-motifs enriched in negatively charged and aromatic amino acids. Comparison of the canonical direct coactivator recruitment model and a new model describing TADs as nucleosome detergents that trigger chromatin remodeling during gene activation helps solve a fundamental enigma of molecular biology spanning 30 years.


Subject(s)
Nucleosomes/metabolism , Animals , Chromatin/metabolism , Chromatin Assembly and Disassembly , Humans , Machine Learning
4.
Mol Syst Biol ; 14(5): e8190, 2018 05 14.
Article in English | MEDLINE | ID: mdl-29759983

ABSTRACT

Over 40% of proteins in any eukaryotic genome encode intrinsically disordered regions (IDRs) that do not adopt defined tertiary structures. Certain IDRs perform critical functions, but discovering them is non-trivial as the biological context determines their function. We present IDR-Screen, a framework to discover functional IDRs in a high-throughput manner by simultaneously assaying large numbers of DNA sequences that code for short disordered sequences. Functionality-conferring patterns in their protein sequence are inferred through statistical learning. Using yeast HSF1 transcription factor-based assay, we discovered IDRs that function as transactivation domains (TADs) by screening a random sequence library and a designed library consisting of variants of 13 diverse TADs. Using machine learning, we find that segments devoid of positively charged residues but with redundant short sequence patterns of negatively charged and aromatic residues are a generic feature for TAD functionality. We anticipate that investigating defined sequence libraries using IDR-Screen for specific functions can facilitate discovering novel and functional regions of the disordered proteome as well as understand the impact of natural and disease variants in disordered segments.


Subject(s)
DNA-Binding Proteins/genetics , Heat-Shock Proteins/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Transcription Factors/genetics , Transcriptional Activation , Cloning, Molecular , Gene Library , High-Throughput Nucleotide Sequencing , Machine Learning , Proteome/genetics , Sequence Analysis, DNA
5.
Article in English | MEDLINE | ID: mdl-27679670

ABSTRACT

After more than three decades since the discovery of transcription activation domains (ADs) in gene-specific activators, the mechanism of their function remains enigmatic. The widely accepted model of direct recruitment by ADs of co-activators and basal transcriptional machinery components, however, is not always compatible with the short size yet very high degree of sequence randomness and intrinsic structural disorder of natural and synthetic ADs. In this review, we formulate the basis for an alternative and complementary model, whereby sequence randomness and intrinsic structural disorder of ADs are necessary for transient distorting interactions with promoter nucleosomes, triggering promoter nucleosome translocation and subsequently gene activation.

6.
Cell Stress Chaperones ; 20(5): 833-41, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26003133

ABSTRACT

Development of novel anti-cancer drug leads that target regulators of protein homeostasis is a formidable task in modern pharmacology. Finding specific inhibitors of human Heat Shock Factor 1 (hHSF1) has proven to be a challenging task, while screening for inhibitors of human Heat Shock Factor 2 (hHSF2) has never been described. We report the development of a novel system based on an in vivo cell growth restoration assay designed to identify specific inhibitors of human HSF2 in a high-throughput format. This system utilizes a humanized yeast strain in which the master regulator of molecular chaperone genes, yeast HSF, has been replaced with hHSF2 with no detrimental effect on cell growth. This replacement preserves the general regulatory patterns of genes encoding major molecular chaperones including Hsp70 and Hsp90. The controlled overexpression of hHSF2 creates a slow-growth phenotype, which is the basis of the growth restoration assay used for high-throughput screening. The phenotype is most robust when cells are cultured at 25 °C, while incubation at temperatures greater than 30 °C leads to compensation of the phenotype. Overexpression of hHSF2 causes overexpression of molecular chaperones which is a likely cause of the slowed growth. Our assay is characterized by two unique advantages. First, screening takes place in physiologically relevant, in vivo conditions. Second, hits in our screen will be of medically relevant potency, as compounds that completely inhibit hHSF2 function will further inhibit cell growth and therefore will not be scored as hits. This caveat biases our screening system for compounds capable of restoring hHSF2 activity to a physiologically normal level without completely inhibiting this essential system.


Subject(s)
Heat-Shock Proteins/genetics , High-Throughput Screening Assays/methods , Transcription Factors/genetics , Heat-Shock Proteins/antagonists & inhibitors , Humans , Molecular Chaperones/metabolism , Organisms, Genetically Modified , Saccharomyces cerevisiae , Transcription Factors/antagonists & inhibitors
7.
Methods Mol Biol ; 809: 279-89, 2012.
Article in English | MEDLINE | ID: mdl-22113283

ABSTRACT

Investigation of DNA-protein interactions is a key approach in understanding mechanisms of gene regulation. The method described allows detection of dynamic DNA-protein interactions occurring at gene promoters in living cells during the time scale of seconds and minutes. The combination of chromatin immunoprecipitation with real-time PCR allows for detection of changes in activator and co-activator content of any promoter during transcriptional activation. The described method is most applicable to investigation of processes resulting in nucleosome loss at gene promoters during the induction of transcription. The approach is also applicable to any dynamic process involving DNA-protein interactions.


Subject(s)
Chromatin Assembly and Disassembly/physiology , Chromatin Immunoprecipitation/methods , Real-Time Polymerase Chain Reaction/methods , Chromatin Assembly and Disassembly/genetics , Heat-Shock Proteins/genetics , Heat-Shock Proteins/metabolism , Histones/metabolism , Protein Binding , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcriptional Activation
8.
In Silico Biol ; 9(5-6): 379-89, 2009.
Article in English | MEDLINE | ID: mdl-22430439

ABSTRACT

Cellular stress responses are characterized by coordinated transcriptional induction of genes encoding a group of conserved proteins known as molecular chaperones, most of which are also known as heat shock proteins (HSPs). In S. cerevisiae, transcriptional responses to stress are mediated via two trans-regulatory activators: heat shock transcription factors (HSFs) that bind to heat shock elements (HSEs), and the Msn2 and Msn4 transcription factors that bind to stress response elements (STREs). Recent studies in S. cerevisiae demonstrated that a significant portion of the non-coding region in the genome is transcribed and this intergenic transcription could regulate the transcription of adjacent genes by transcription interference. The goal of this study was to analyze the genomic distribution of HSF and Msn2/4 binding sites and to study the potential for transcription interference regulated by stress response systems. Our genome-wide analysis revealed that 297 genes have STREs in their promoter region, whereas 310 genes contained HSEs. Twenty-five genes had both HSEs and STREs in their promoters. The first set of genes is potentially regulated by the Msn2/Msn4/STRE interaction. For the second set of genes, regulation by heat shock could be mediated through HSF/HSE regulatory mechanisms. The overlap between these groups suggests a co-regulation by the two pathways. Our study yielded 239 candidate genes, whose regulation could potentially be affected by heat-shock via transcription interference directed both from upstream and downstream areas relative to the native promoters. In addition we have categorized 924 genes containing HSE and/or STRE elements within the Open Reading Frames (ORFs), which may also affect normal transcription. Our study revealed a widespread possibility for the regulation of genes via transcriptional interference initiated by stress response. We provided a categorization of genes potentially affected at the transcriptional level by known stress-response systems.
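
A binding-site survey of this kind comes down to scanning promoter sequences for short motifs on both DNA strands. The Python sketch below shows one simple way to do that. It is not the pipeline used in the study: the function names and the example sequence are hypothetical, and the pentamer AGGGG is used only as the commonly cited STRE core, since the abstract does not state the motif definitions the authors used.

```python
# Illustrative sketch only; motif and example sequence are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_motif(dna, motif):
    """Return sorted 0-based start positions where `motif` occurs on either strand.

    A hit on the reverse strand is reported at the position where the
    reverse complement of the motif starts on the forward strand.
    """
    dna = dna.upper()
    motif = motif.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(dna) - k + 1) if dna[i:i + k] in targets]


# Made-up promoter fragment with one forward and one reverse-strand match.
hits = find_motif("TTAGGGGAACCCCTT", "AGGGG")
```

Running the same scan with each element's motif over every promoter, then intersecting the resulting gene sets, would give the kind of "STRE only", "HSE only" and "both" categories reported in the abstract.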


Subject(s)
Gene Expression Regulation, Fungal , Genome, Fungal/genetics , Response Elements/genetics , Saccharomyces cerevisiae/genetics , Stress, Physiological/genetics , Transcription, Genetic , Apoptosis/genetics , Base Sequence , Binding Sites/genetics , Down-Regulation/genetics , Genes, Fungal/genetics , Heat-Shock Response/genetics , Hydrogen Bonding , Models, Genetic , Molecular Sequence Data , Open Reading Frames/genetics
9.
Mol Cell Biol ; 28(4): 1207-17, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18070923

ABSTRACT

The stress response in yeast cells is regulated by at least two classes of transcription activators, HSF and Msn2/4, which differentially affect promoter chromatin remodeling. We demonstrate that the deletion of SNF2, an ATPase activity-containing subunit of the chromatin remodeling SWI/SNF complex, eliminates histone displacement, RNA polymerase II recruitment, and heat shock factor (HSF) binding at the HSP12 promoter while delaying these processes at the HSP82 and SSA4 promoters. Out of the three promoters, the double deletion of MSN2 and MSN4 eliminates both chromatin remodeling and HSF binding only at the HSP12 promoter, suggesting that Msn2/4 activators are primary determinants of chromatin disassembly at the HSP12 promoter. Unexpectedly, during heat shock the level of Msn2/4 at the HSP12 promoter declines. This is likely a result of promoter-targeted Msn2/4 degradation associated with transcription complex assembly. While histone displacement kinetic profiles bear clear promoter specificity, the kinetic profiles of recovery from heat shock for all analyzed genes display an equal or even higher nucleosome return rate, which is to some extent delayed by the deletion of SNF2.


Subject(s)
Chromosomal Proteins, Non-Histone/metabolism , Gene Expression Regulation, Fungal , Heat-Shock Proteins/genetics , Nucleosomes/metabolism , Promoter Regions, Genetic/genetics , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Transcription Factors/metabolism , Chromatin Assembly and Disassembly , DNA-Binding Proteins/deficiency , DNA-Binding Proteins/metabolism , Gene Deletion , Genes, Fungal , Heat-Shock Response , Kinetics , Protein Binding , RNA Polymerase II/metabolism , Saccharomyces cerevisiae/metabolism , Transcription Factors/deficiency , Transcription, Genetic
10.
Biochem Cell Biol ; 82(4): 453-9, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15284898

ABSTRACT

Activation domains of promoter-specific transcription factors are critical entities involved in recruitment of multiple protein complexes to gene promoters. The activation domains often retain functionality when transferred between very diverse eukaryotic phyla, yet the amino acid sequences of activation domains do not bear any specific consensus or secondary structure. Activation domains function in the context of chromatin structure and are critical for chromatin remodeling, which is associated with transcription initiation. The mechanisms of direct and indirect recruitment of chromatin-remodeling and histone-modifying complexes, including mechanisms involving direct interactions between activation domains and histones, are discussed.


Subject(s)
Histones/chemistry , Transcription Factors/chemistry , Animals , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , Humans , Models, Biological , Nucleosomes/chemistry , Promoter Regions, Genetic , Protein Binding , Protein Structure, Tertiary , Transcription, Genetic