1.
J Biol Chem ; 293(19): 7476-7485, 2018 05 11.
Article in English | MEDLINE | ID: mdl-29523690

ABSTRACT

Proteins with domains that recognize and bind post-translational modifications (PTMs) of histones are collectively termed epigenetic readers. Numerous interactions between specific reader protein domains and histone PTMs and their regulatory outcomes have been reported, but little is known about how reader proteins may in turn be modulated by these interactions. Tripartite motif-containing protein 24 (TRIM24) is a histone reader aberrantly expressed in multiple cancers. Here, our investigation revealed functional cross-talk between histone acetylation and TRIM24 SUMOylation. Binding of TRIM24 to chromatin via its tandem PHD-bromodomain, which recognizes unmethylated lysine 4 and acetylated lysine 23 of histone H3 (H3K4me0/K23ac), led to TRIM24 SUMOylation at lysine residues 723 and 741. Inactivation of the bromodomain, either by mutation or with a small-molecule inhibitor, IACS-9571, abolished TRIM24 SUMOylation. Conversely, inhibition of histone deacetylation markedly increased TRIM24's interaction with chromatin and its SUMOylation. Of note, gene expression profiling of MCF7 cells expressing WT versus SUMO-deficient TRIM24 identified cell adhesion as the major pathway regulated by the cross-talk between chromatin acetylation and TRIM24 SUMOylation. In conclusion, our findings establish a new link between histone H3 acetylation and SUMOylation of the reader protein TRIM24, a functional connection that may bear on TRIM24's oncogenic function and may inform future studies of PTM cross-talk between histones and epigenetic regulators.


Subject(s)
Carrier Proteins/metabolism , Cell Adhesion , Chromatin/metabolism , Sumoylation , Acetylation , Carrier Proteins/chemistry , Epigenesis, Genetic , HEK293 Cells , Histones/metabolism , Humans , MCF-7 Cells , Oncogenes , Protein Processing, Post-Translational
2.
Stem Cell Reports ; 9(6): 2065-2080, 2017 12 12.
Article in English | MEDLINE | ID: mdl-29198826

ABSTRACT

Reprogramming to induced pluripotent stem cells (iPSCs) and differentiation of pluripotent stem cells (PSCs) are regulated by epigenetic machinery. Tripartite motif protein 28 (TRIM28), a universal mediator of Krüppel-associated box domain zinc fingers (KRAB-ZNFs), is known to regulate both processes; however, the exact mechanism and identity of participating KRAB-ZNF genes remain unknown. Here, using a reporter system, we show that TRIM28/KRAB-ZNFs alter DNA methylation patterns in addition to H3K9me3 to cause stable gene repression during reprogramming. Using several expression datasets, we identified KRAB-ZNFs (ZNF114, ZNF483, ZNF589) in the human genome that maintain pluripotency. Moreover, we identified target genes repressed by these KRAB-ZNFs. Mechanistically, we demonstrated that these KRAB-ZNFs directly alter gene expression of important developmental genes by modulating H3K9me3 and DNA methylation of their promoters. In summary, TRIM28 employs KRAB-ZNFs to evoke epigenetic silencing of its target differentiation genes via H3K9me3 and DNA methylation.


Subject(s)
Cell Differentiation/genetics , Pluripotent Stem Cells/metabolism , Repressor Proteins/genetics , Tripartite Motif-Containing Protein 28/genetics , Binding Sites , Cell Self Renewal/genetics , Cellular Reprogramming/genetics , DNA Methylation/genetics , Epigenetic Repression , Gene Expression Regulation, Developmental/genetics , Histone-Lysine N-Methyltransferase/genetics , Humans , Pluripotent Stem Cells/cytology , Promoter Regions, Genetic
3.
Oncotarget ; 8(1): 863-882, 2017 Jan 03.
Article in English | MEDLINE | ID: mdl-27845900

ABSTRACT

The expression of Tripartite motif-containing protein 28 (TRIM28)/Krüppel-associated box (KRAB)-associated protein 1 (KAP1) is elevated in at least 14 tumor types, including solid and hematopoietic tumors. A high level of TRIM28 is associated with the triple-negative subtype of breast cancer (TNBC), which shows higher aggressiveness and lower survival rates. Interestingly, TRIM28 is essential for maintaining the pluripotent phenotype in embryonic stem cells. Following on that finding, we evaluated the role of TRIM28 protein in the regulation of breast cancer stem cell (CSC) populations and tumorigenesis in vitro and in vivo. Downregulation of TRIM28 expression in xenografts led to decreased expression of pluripotency and mesenchymal markers, as well as inhibition of signaling pathways involved in the complex mechanism of CSC maintenance. Moreover, TRIM28 depletion reduced the ability of cancer cells to induce tumor growth when subcutaneously injected in limiting dilutions. Our data demonstrate that the downregulation of TRIM28 gene expression reduced the ability of CSCs to self-renew, which resulted in a significant reduction of tumor growth. Loss of function of TRIM28 leads to dysregulation of cell cycle, cellular response to stress, cancer cell metabolism, and inhibition of oxidative phosphorylation. All these mechanisms directly regulate maintenance of the CSC population. Our original results revealed the role of TRIM28 in regulating the CSC population in breast cancer. These findings may pave the way to novel and more effective therapies targeting cancer stem cells in breast tumors.


Subject(s)
Breast Neoplasms/etiology , Breast Neoplasms/metabolism , Cell Transformation, Neoplastic/metabolism , Neoplastic Stem Cells/metabolism , Tripartite Motif-Containing Protein 28/metabolism , Animals , Biomarkers , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Cell Line, Tumor , Cell Proliferation , Cell Survival/genetics , Cell Transformation, Neoplastic/genetics , Disease Models, Animal , Disease Progression , Energy Metabolism , Female , Gene Expression , Gene Knockdown Techniques , Heterografts , Humans , Mice , Neoplasm Metastasis , Oxidative Phosphorylation , Prognosis , Proportional Hazards Models , Recurrence , Signal Transduction , Tripartite Motif-Containing Protein 28/chemistry , Tripartite Motif-Containing Protein 28/genetics , Triple Negative Breast Neoplasms/etiology , Triple Negative Breast Neoplasms/metabolism , Triple Negative Breast Neoplasms/pathology
4.
Cancer Res ; 75(18): 3865-3878, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26139243

ABSTRACT

The SWI/SNF multisubunit complex modulates chromatin structure through the activity of two mutually exclusive catalytic subunits, SMARCA2 and SMARCA4, which both contain a bromodomain and an ATPase domain. Using RNAi, cancer-specific vulnerabilities have been identified in SWI/SNF-mutant tumors, including SMARCA4-deficient lung cancer; however, the contribution of conserved, druggable protein domains to this anticancer phenotype is unknown. Here, we functionally deconstruct the SMARCA2/4 paralog dependence of cancer cells using bioinformatics, genetic, and pharmacologic tools. We evaluate a selective SMARCA2/4 bromodomain inhibitor (PFI-3) and characterize its activity in chromatin-binding and cell-functional assays focusing on cells with altered SWI/SNF complex (e.g., lung, synovial sarcoma, leukemia, and rhabdoid tumors). We demonstrate that PFI-3 is a potent, cell-permeable probe capable of displacing ectopically expressed, GFP-tagged SMARCA2-bromodomain from chromatin, yet contrary to target knockdown, the inhibitor fails to display an antiproliferative phenotype. Mechanistically, the lack of pharmacologic efficacy is reconciled by the failure of bromodomain inhibition to displace endogenous, full-length SMARCA2 from chromatin as determined by in situ cell extraction, chromatin immunoprecipitation, and target gene expression studies. Furthermore, using inducible RNAi and cDNA complementation (bromodomain- and ATPase-dead constructs), we unequivocally identify the ATPase domain, and not the bromodomain of SMARCA2, as the relevant therapeutic target with the catalytic activity suppressing defined transcriptional programs. Taken together, our complementary genetic and pharmacologic studies exemplify a general strategy for multidomain protein drug-target validation and in case of SMARCA2/4 highlight the potential for drugging the more challenging helicase/ATPase domain to deliver on the promise of synthetic-lethality therapy.


Subject(s)
Azabicyclo Compounds/pharmacology , Chromatin Assembly and Disassembly/drug effects , Chromosomal Proteins, Non-Histone/deficiency , DNA Helicases/antagonists & inhibitors , Molecular Targeted Therapy , Neoplasm Proteins/antagonists & inhibitors , Neoplasms/drug therapy , Nuclear Proteins/antagonists & inhibitors , Pyridines/pharmacology , Transcription Factors/antagonists & inhibitors , Transcription Factors/deficiency , Binding, Competitive , Catalysis , Cell Line, Tumor , Chromatin/metabolism , Chromosomal Proteins, Non-Histone/genetics , DNA Helicases/chemistry , DNA Helicases/deficiency , DNA, Complementary/genetics , Gene Knockout Techniques , Genetic Complementation Test , Humans , Lung Neoplasms/pathology , Microarray Analysis , Neoplasms/genetics , Nuclear Proteins/chemistry , Nuclear Proteins/deficiency , Protein Structure, Tertiary , RNA Interference , RNA, Small Interfering/pharmacology , Rhabdoid Tumor/genetics , Rhabdoid Tumor/pathology , Sarcoma, Synovial/genetics , Sarcoma, Synovial/pathology , Transcription Factors/chemistry , Transcription Factors/genetics
5.
Contemp Oncol (Pozn) ; 19(1A): A78-91, 2015.
Article in English | MEDLINE | ID: mdl-25691827

ABSTRACT

Our current understanding of cancer genetics is grounded on the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from raw reads generated by sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets.

6.
Contemp Oncol (Pozn) ; 19(1A): 1-2, 2015.
Article in English | MEDLINE | ID: mdl-28190078
7.
Sci Signal ; 7(326): ra47, 2014 May 20.
Article in English | MEDLINE | ID: mdl-24847116

ABSTRACT

Androgen deprivation is the standard treatment for advanced prostate cancer (PCa), but most patients ultimately develop resistance and tumor recurrence. We found that MYB is transcriptionally activated by androgen deprivation therapy or genetic silencing of the androgen receptor (AR). MYB silencing inhibited PCa growth in culture and xenografts in mice. Microarray data revealed that c-Myb and AR shared a subset of target genes that encode DNA damage response (DDR) proteins, suggesting that c-Myb may supplant AR as the dominant regulator of their common DDR target genes in AR inhibition-resistant or AR-negative PCa. Gene signatures including AR, MYB, and their common DDR-associated target genes positively correlated with metastasis, castration resistance, tumor recurrence, and decreased survival in PCa patients. In culture and in xenograft-bearing mice, a combination strategy involving the knockdown of MYB, BRCA1, or TOPBP1 or the abrogation of cell cycle checkpoint arrest with AZD7762, an inhibitor of the checkpoint kinase Chk1, increased the cytotoxicity of the poly[adenosine 5'-diphosphate (ADP)-ribose] polymerase (PARP) inhibitor olaparib in PCa cells. Our results reveal new mechanism-based therapeutic approaches for PCa by targeting PARP and the DDR pathway involving c-Myb, TopBP1, ataxia telangiectasia mutated- and Rad3-related (ATR), and Chk1.


Subject(s)
DNA Damage , Phthalazines/pharmacology , Piperazines/pharmacology , Poly(ADP-ribose) Polymerase Inhibitors , Prostatic Neoplasms/drug therapy , Proto-Oncogene Proteins c-myb/antagonists & inhibitors , Thiophenes/pharmacology , Urea/analogs & derivatives , Animals , BRCA1 Protein/genetics , BRCA1 Protein/metabolism , Carrier Proteins/genetics , Carrier Proteins/metabolism , Castration , Cell Cycle Checkpoints/drug effects , Cell Cycle Checkpoints/genetics , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Humans , Male , Mice , Mice, Nude , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Poly(ADP-ribose) Polymerases/genetics , Poly(ADP-ribose) Polymerases/metabolism , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Prostatic Neoplasms/pathology , Proto-Oncogene Proteins c-myb/genetics , Proto-Oncogene Proteins c-myb/metabolism , Receptors, Androgen/genetics , Receptors, Androgen/metabolism , Urea/pharmacology , Xenograft Model Antitumor Assays
8.
BMC Genomics ; 14: 672, 2013 Oct 02.
Article in English | MEDLINE | ID: mdl-24088394

ABSTRACT

BACKGROUND: Multiple myeloma (MM) is a malignant proliferation of plasma B cells. Based on recurrent aneuploidy such as copy number alterations (CNAs), myeloma is divided into two subtypes with different CNA patterns and patient survival outcomes. How aneuploidy events arise, and whether they contribute to cancer cell evolution, are actively studied. The large number of transcriptomic changes resulting from CNAs (dosage effect) poses big challenges for identifying functional consequences of CNAs in myeloma in terms of specific driver genes and pathways. In this study, we hypothesize that gene-wise dosage effect varies as a result of complex regulatory networks that translate the impact of CNAs to gene expression, and studying this variation can provide insights into functional effects of CNAs. RESULTS: We propose gene-wise dosage effect score and genome-wide karyotype plot as tools to measure and visualize concordant copy number and expression changes across cancer samples. We find that dosage effect in myeloma is widespread yet variable, and it is correlated with gene expression level and CNA frequencies in different chromosomes. Our analysis suggests that despite the enrichment of differentially expressed genes between hyperdiploid MM and non-hyperdiploid MM in the trisomy chromosomes, the chromosomal proportion of dosage sensitive genes is higher in the non-trisomy chromosomes. Dosage-sensitive genes are enriched for genes with protein translation and localization functions, and dosage-resistant genes are enriched for apoptosis genes. These results point to future studies on differential dosage sensitivity and resistance of pro- and anti-proliferation pathways and their variation across patients as therapeutic targets and prognosis markers. CONCLUSIONS: Our findings support the hypothesis that recurrent CNAs in myeloma are selected by their functional consequences. The novel dosage effect score defined in this work will facilitate integration of copy number and expression data for identifying driver genes in cancer genomics studies. The accompanying R code is available at http://www.canevolve.org/dosageEffect/.


Subject(s)
Gene Dosage/genetics , Multiple Myeloma/genetics , Chromosomes, Human/genetics , Cluster Analysis , DNA Copy Number Variations/genetics , Databases, Genetic , Diploidy , Exons/genetics , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genes, Neoplasm/genetics , Genetic Heterogeneity , Humans , Karyotyping , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide/genetics , Trisomy/genetics
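
The abstract above does not give the formula for its gene-wise dosage effect score. As a rough illustration of the general idea only (measuring how concordantly one gene's copy number and expression vary across samples), the sketch below uses a plain Pearson correlation. This is a stand-in chosen for illustration, not the score defined in the paper; the function name and the toy values are hypothetical.

```python
import math


def concordance_score(copy_numbers, expression):
    """Pearson correlation between one gene's copy number and its expression.

    Illustrative stand-in only; NOT the dosage effect score from the paper.
    Returns 0.0 when either vector is constant (correlation undefined).
    """
    n = len(copy_numbers)
    mean_cn = sum(copy_numbers) / n
    mean_ex = sum(expression) / n
    cov = sum((c - mean_cn) * (e - mean_ex)
              for c, e in zip(copy_numbers, expression))
    sd_cn = math.sqrt(sum((c - mean_cn) ** 2 for c in copy_numbers))
    sd_ex = math.sqrt(sum((e - mean_ex) ** 2 for e in expression))
    if sd_cn == 0 or sd_ex == 0:
        return 0.0
    return cov / (sd_cn * sd_ex)


# Hypothetical values for one gene measured in five samples.
copy_numbers = [2, 2, 3, 3, 4]
expression = [1.0, 1.0, 1.5, 1.5, 2.0]  # rises in step with copy number
score = concordance_score(copy_numbers, expression)
```

A score near 1 would indicate a gene whose expression tracks its copy number closely (dosage sensitive in the abstract's terms), while a score near 0 would indicate a gene whose expression is buffered against copy number changes.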
9.
PLoS One ; 8(3): e58809, 2013.
Article in English | MEDLINE | ID: mdl-23554930

ABSTRACT

Multiple myeloma (MM) is a cancer of antibody-making plasma cells. It frequently harbors alterations in DNA and chromosome copy numbers, and can be divided into two major subtypes, hyperdiploid (HMM) and non-hyperdiploid multiple myeloma (NHMM). The two subtypes have different survival prognosis, possibly due to different but converging paths to oncogenesis. Existing methods for identifying the two subtypes are fluorescence in situ hybridization (FISH) and copy number microarrays, with increased cost and sample requirements. We hypothesize that chromosome alterations have their imprint in gene expression through dosage effect. Using five MM expression datasets that have HMM status measured by FISH and copy number microarrays, we have developed and validated a K-nearest-neighbor method to classify MM into HMM and NHMM based on gene expression profiles. Classification accuracy for test datasets ranges from 0.83 to 0.88. This classification will enable researchers to study differences and commonalities of the two MM subtypes in disease biology and prognosis using expression datasets without need for additional subtype measurements. Our study also supports the advantages of using cancer specific characteristics in feature design and pooling multiple rounds of classification results to improve accuracy. We provide R source code and processed datasets at www.ChengLiLab.org/software.


Subject(s)
Gene Expression Profiling , Multiple Myeloma/genetics , Polyploidy , Gene Dosage , Humans , In Situ Hybridization, Fluorescence , Multiple Myeloma/diagnosis , Multiple Myeloma/mortality , Reproducibility of Results , Trisomy
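
The K-nearest-neighbor classification described in the abstract above can be sketched in a few lines. This is a minimal, generic KNN with majority voting, not the authors' implementation (their R code is at the URL given in the abstract); the two-feature toy profiles and the function name are assumptions made for illustration, and the pooling of multiple classification rounds mentioned in the abstract is not reproduced here.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query profile.
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: mean expression of genes on two chromosome groups.
train_x = [(1.9, 1.0), (2.1, 1.1), (1.8, 0.9),
           (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]
train_y = ["HMM", "HMM", "HMM", "NHMM", "NHMM", "NHMM"]

prediction = knn_predict(train_x, train_y, (2.0, 1.0), k=3)
```

An odd `k` avoids tied votes when there are only two classes, which is why `k=3` is used in the toy call.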
10.
PLoS One ; 8(2): e56228, 2013.
Article in English | MEDLINE | ID: mdl-23418540

ABSTRACT

BACKGROUND & OBJECTIVE: Genome-wide profiles of tumors obtained using functional genomics platforms are being deposited in public repositories at an astronomical scale, as a result of focused efforts by individual laboratories and large projects such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium. Consequently, there is an urgent need for reliable tools that integrate and interpret these data in light of current knowledge and disseminate results to biomedical researchers in a user-friendly manner. We have built the canEvolve web portal to meet this need. RESULTS: canEvolve query functionalities are designed to fulfill the most frequent analysis needs of cancer researchers with a view to generating novel hypotheses. canEvolve stores gene, microRNA (miRNA) and protein expression profiles, copy number alterations for multiple cancer types, and protein-protein interaction information. canEvolve allows querying of results of primary analysis, integrative analysis and network analysis of oncogenomics data. The querying for primary analysis includes differential gene and miRNA expression as well as changes in gene copy number measured with SNP microarrays. canEvolve provides results of integrative analysis of gene expression profiles with copy number alterations and with miRNA profiles as well as generalized integrative analysis using gene set enrichment analysis. The network analysis capability includes storage and visualization of gene co-expression, inferred gene regulatory networks and protein-protein interaction information. Finally, canEvolve provides correlations between gene expression and clinical outcomes in terms of univariate survival analysis. CONCLUSION: At present canEvolve provides different types of information extracted from 90 cancer genomics studies comprising more than 10,000 patients. The presence of multiple data types, novel integrative analysis for identifying regulators of oncogenesis, network analysis and the ability to query gene lists/pathways are distinctive features of canEvolve. canEvolve will facilitate integrative and meta-analysis of oncogenomics datasets. AVAILABILITY: The canEvolve web portal is available at http://www.canevolve.org/.


Subject(s)
Genomics/methods , Internet , Oncogenes/genetics , Software , DNA Copy Number Variations , Databases, Nucleic Acid , Gene Expression Profiling , Gene Regulatory Networks , Humans , Information Storage and Retrieval/methods , MicroRNAs/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide , Reproducibility of Results , User-Computer Interface
11.
Cancer Cell ; 22(3): 345-58, 2012 Sep 11.
Article in English | MEDLINE | ID: mdl-22975377

ABSTRACT

Bortezomib therapy has proven successful for the treatment of relapsed/refractory, relapsed, and newly diagnosed multiple myeloma (MM); however, dose-limiting toxicities and the development of resistance limit its long-term utility. Here, we show that P5091 is an inhibitor of deubiquitylating enzyme USP7, which induces apoptosis in MM cells resistant to conventional and bortezomib therapies. Biochemical and genetic studies show that blockade of HDM2 and p21 abrogates P5091-induced cytotoxicity. In animal tumor model studies, P5091 is well tolerated, inhibits tumor growth, and prolongs survival. Combining P5091 with lenalidomide, HDAC inhibitor SAHA, or dexamethasone triggers synergistic anti-MM activity. Our preclinical study therefore supports clinical evaluation of USP7 inhibitor, alone or in combination, as a potential MM therapy.


Subject(s)
Antineoplastic Agents/pharmacology , Apoptosis/drug effects , Boronic Acids/pharmacology , Multiple Myeloma/drug therapy , Pyrazines/pharmacology , Thiophenes/pharmacology , Ubiquitin Thiolesterase/antagonists & inhibitors , Animals , Antineoplastic Agents/therapeutic use , Antineoplastic Combined Chemotherapy Protocols/pharmacology , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Boronic Acids/therapeutic use , Bortezomib , Cell Line, Tumor , Cyclin-Dependent Kinase Inhibitor p21/antagonists & inhibitors , Dexamethasone/pharmacology , Dexamethasone/therapeutic use , Drug Resistance, Neoplasm/drug effects , Drug Therapy, Combination , Humans , Lenalidomide , Mice , Mice, SCID , Molecular Sequence Data , Multiple Myeloma/enzymology , Multiple Myeloma/pathology , Neovascularization, Pathologic/drug therapy , Protease Inhibitors/pharmacology , Protease Inhibitors/therapeutic use , Proto-Oncogene Proteins c-mdm2/metabolism , Pyrazines/therapeutic use , Random Allocation , Thalidomide/analogs & derivatives , Thalidomide/pharmacology , Thalidomide/therapeutic use , Thiophenes/therapeutic use , Ubiquitin Thiolesterase/genetics , Ubiquitin-Specific Peptidase 7 , Xenograft Model Antitumor Assays
12.
Nucleic Acids Res ; 40(17): e135, 2012 Sep 01.
Article in English | MEDLINE | ID: mdl-22645320

ABSTRACT

We describe here a novel method for integrating gene and miRNA expression profiles in cancer using feed-forward loops (FFLs) consisting of transcription factors (TFs), miRNAs and their common target genes. The dChip-GemiNI (Gene and miRNA Network-based Integration) method statistically ranks computationally predicted FFLs by their explanatory power to account for differential gene and miRNA expression between two biological conditions such as normal and cancer. GemiNI integrates not only gene and miRNA expression data but also computationally derived information about TF-target gene and miRNA-mRNA interactions. Literature validation shows that the integrated modeling of expression data and FFLs better identifies cancer-related TFs and miRNAs compared to existing approaches. We have utilized GemiNI for analyzing six data sets of solid cancers (liver, kidney, prostate, lung and germ cell) and found that top-ranked FFLs account for ∼20% of transcriptome changes between normal and cancer. We have identified common FFL regulators across multiple cancer types, such as known FFLs consisting of MYC and miR-15/miR-17 families, and novel FFLs consisting of ARNT, CREB1 and their miRNA partners. The results and analysis web server are available at http://www.canevolve.org/dChip-GemiNi.


Subject(s)
Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , MicroRNAs/metabolism , Transcription Factors/metabolism , Transcriptome , Feedback, Physiological , Humans , Neoplasms/genetics , Neoplasms/metabolism
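
The feed-forward loops (FFLs) in the abstract above consist of a transcription factor, a miRNA and their common target genes. The sketch below enumerates one common form of such a loop (the TF regulates both the miRNA and the gene, and the miRNA also targets the gene) from two predicted-interaction tables. It only illustrates the loop structure; it is not the dChip-GemiNI ranking method, and all regulator and gene names are placeholders rather than real interactions.

```python
def find_ffls(tf_targets, mirna_targets):
    """Enumerate (TF, miRNA, gene) feed-forward loops.

    tf_targets:    dict mapping each TF to the set of its targets
                   (genes and miRNAs).
    mirna_targets: dict mapping each miRNA to the set of its target genes.
    """
    loops = []
    for tf in sorted(tf_targets):
        targets = tf_targets[tf]
        # miRNAs that this TF is predicted to regulate.
        for mirna in sorted(t for t in targets if t in mirna_targets):
            # Genes targeted by both the TF and the miRNA close the loop.
            shared = targets & mirna_targets[mirna]
            loops.extend((tf, mirna, gene) for gene in sorted(shared))
    return loops


# Placeholder interaction tables (hypothetical names).
tf_targets = {
    "TF_A": {"miR_X", "GENE_1", "GENE_2"},
    "TF_B": {"GENE_2"},
}
mirna_targets = {"miR_X": {"GENE_1", "GENE_3"}}

ffls = find_ffls(tf_targets, mirna_targets)
```

With these placeholder tables the only loop found is `("TF_A", "miR_X", "GENE_1")`, because `GENE_1` is the only gene targeted by both `TF_A` and `miR_X`.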
13.
Brief Bioinform ; 13(3): 305-16, 2012 May.
Article in English | MEDLINE | ID: mdl-21949216

ABSTRACT

Over the last decade, multiple functional genomic datasets studying chromosomal aberrations and their downstream effects on gene expression have accumulated for several cancer types. A vast majority of them are in the form of paired gene expression profiles and somatic copy number alterations (CNA) information on the same patients identified using microarray platforms. In response, many algorithms and software packages are available for integrating these paired data. Surprisingly, there has been no serious attempt to review the currently available methodologies or the novel insights brought using them. In this work, we discuss the quantitative relationships observed between CNA and gene expression in multiple cancer types and biological milestones achieved using the available methodologies. We discuss the conceptual evolution of both the step-wise and the joint data integration methodologies over the last decade. We conclude by providing suggestions for building efficient data integration methodologies and asking further biological questions.


Subject(s)
Algorithms , Neoplasms/genetics , Data Interpretation, Statistical , Gene Dosage , Gene Expression Profiling/methods , Genomics/methods , Oligonucleotide Array Sequence Analysis/methods
14.
BMC Bioinformatics ; 12: 251, 2011 Jun 21.
Article in English | MEDLINE | ID: mdl-21693021

ABSTRACT

BACKGROUND: Target specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, studies on proteomics for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. Therefore, it is important to understand the properties of protein sequences that are important for antigenicity and to identify small peptide epitopes and large regions in the linear sequence of the proteins whose utilization result in specific antibodies. RESULTS: Our analysis using protein properties suggested that sequence composition combined with evolutionary information and predicted secondary structure, as well as solvent accessibility is sufficient to predict successful peptide epitopes. The antigenicity and the specificity in immune response were also found to depend on the epitope length. We trained the B-Cell Epitope Oracle (BEOracle), a support vector machine (SVM) classifier, for the identification of continuous B-Cell epitopes with these protein properties as learning features. The BEOracle achieved an F1-measure of 81.37% on a large validation set. The BEOracle classifier outperformed the classical methods based on propensity and sophisticated methods like BCPred and Bepipred for B-Cell epitope prediction. The BEOracle classifier also identified peptides for the ChIP-grade antibodies from the modENCODE/ENCODE projects with 96.88% accuracy. High BEOracle score for peptides showed some correlation with the antibody intensity on Immunofluorescence studies done on fly embryos. Finally, a second SVM classifier, the B-Cell Region Oracle (BROracle) was trained with the BEOracle scores as features to predict the performance of antibodies generated with large protein regions with high accuracy. The BROracle classifier achieved accuracies of 75.26-63.88% on a validation set with immunofluorescence, immunohistochemistry, protein arrays and western blot results from Protein Atlas database. 
CONCLUSIONS: Together our results suggest that antigenicity is a local property of the protein sequences and that protein sequence properties of composition, secondary structure, solvent accessibility and evolutionary conservation are the determinants of antigenicity and specificity in immune response. Moreover, specificity in immune response could also be accurately predicted for large protein regions without the knowledge of the protein tertiary structure or the presence of discontinuous epitopes. The dataset prepared in this work and the classifier models are available for download at https://sites.google.com/site/oracleclassifiers/.


Subject(s)
Antigens/chemistry , Antigens/immunology , Artificial Intelligence , Animals , Drosophila melanogaster , Epitopes, B-Lymphocyte/chemistry , Epitopes, B-Lymphocyte/immunology , Humans
15.
BMC Bioinformatics ; 12: 72, 2011 Mar 09.
Article in English | MEDLINE | ID: mdl-21388547

ABSTRACT

BACKGROUND: Genome-wide expression signatures are emerging as potential markers for overall survival and disease recurrence risk, as evidenced by recent commercialization of gene expression based biomarkers in breast cancer. Similar predictions have recently been carried out using genome-wide copy number alterations and microRNAs. Existing software packages for microarray data analysis provide functions to define expression-based survival gene signatures. However, there is no software that can perform survival analysis using SNP array data or draw survival curves interactively for expression-based sample clusters. RESULTS: We have developed the survival analysis module in the dChip software that performs survival analysis across the genome for gene expression and copy number microarray data. Built on the current dChip software's microarray analysis functions such as chromosome display and clustering, the new survival functions include interactive exploring of Kaplan-Meier (K-M) plots using expression or copy number data, computing survival p-values from the log-rank test and Cox models, and using permutation to identify significant chromosome regions associated with survival. CONCLUSIONS: The dChip survival module provides a user-friendly way to perform survival analysis and visualize the results in the context of genes and cytobands. It requires no coding expertise and only a minimal learning curve for thousands of existing dChip users. The implementation in Visual C++ also enables fast computation. The software and demonstration data are freely available at http://dchip-surv.chenglilab.org.


Subject(s)
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Software , Breast Neoplasms/genetics , Cluster Analysis , Humans , Kaplan-Meier Estimate , Polymorphism, Single Nucleotide , Proportional Hazards Models
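
The log-rank test mentioned in the abstract above compares survival between two sample groups, for example samples split by high versus low expression of a gene. The following is a minimal textbook two-group log-rank test, written as a sketch and not taken from the dChip source (which is implemented in Visual C++); the function name and the toy follow-up times are assumptions.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*  : follow-up time for each sample
    events_* : 1 if the event (e.g. death) was observed, 0 if censored
    """
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk = [s for s in samples if s[0] >= t]
        n = len(at_risk)                             # at risk, both groups
        n_a = sum(1 for s in at_risk if s[2] == 0)   # at risk, group A
        d = sum(1 for s in samples if s[0] == t and s[1])
        d_a = sum(1 for s in samples
                  if s[0] == t and s[1] and s[2] == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: group A fails early, group B fails late, no censoring.
chi2, p = logrank_test([1, 2, 3, 4], [1, 1, 1, 1],
                       [5, 6, 7, 8], [1, 1, 1, 1])
```

A genome-wide scan of the kind the abstract describes would repeat such a test per gene or region and would then need a multiple-testing or permutation step, which this sketch does not include.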
16.
Nature ; 471(7339): 527-31, 2011 Mar 24.
Article in English | MEDLINE | ID: mdl-21430782

ABSTRACT

Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcription factor binding sites genome-wide has successfully identified specific subtypes of regulatory elements. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb response elements, chromatin states, transcription factor binding sites, RNA polymerase II regulation and insulator elements; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome on the basis of more than 300 chromatin immunoprecipitation data sets for eight chromatin features, five histone deacetylases and thirty-eight site-specific transcription factors at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and validated a subset of predictions for promoters, enhancers and insulators in vivo. We also identified nearly 2,000 genomic regions of dense transcription factor binding associated with chromatin activity and accessibility. We discovered hundreds of new transcription factor co-binding relationships and defined a transcription factor network with over 800 potential regulatory relationships.


Subject(s)
Drosophila melanogaster/genetics , Genome, Insect/genetics , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid/genetics , Animals , Chromatin/metabolism , Chromatin Assembly and Disassembly , Chromatin Immunoprecipitation , Enhancer Elements, Genetic/genetics , Histone Deacetylases/metabolism , Insulator Elements/genetics , Promoter Regions, Genetic/genetics , Reproducibility of Results , Silencer Elements, Transcriptional/genetics , Transcription Factors/metabolism
17.
PLoS Genet ; 6(1): e1000814, 2010 Jan 15.
Article in English | MEDLINE | ID: mdl-20084099

ABSTRACT

Insulators are DNA sequences that control the interactions among genomic regulatory elements and act as chromatin boundaries. A thorough understanding of their location and function is necessary to address the complexities of metazoan gene regulation. We studied by ChIP-chip the genome-wide binding sites of 6 insulator-associated proteins (dCTCF, CP190, BEAF-32, Su(Hw), Mod(mdg4), and GAF) to obtain the first comprehensive map of insulator elements in Drosophila embryos. We identify over 14,000 putative insulators, including all classically defined insulators. We find two major classes of insulators defined by dCTCF/CP190/BEAF-32 and Su(Hw), respectively. Distributional analyses of insulators revealed that particular sub-classes of insulator elements are excluded between cis-regulatory elements and their target promoters; divide differentially expressed, alternative, and divergent promoters; act as chromatin boundaries; are associated with chromosomal breakpoints among species; and are embedded within active chromatin domains. Together, these results provide a map demarcating the boundaries of gene regulatory units and a framework for understanding insulator function during the development and evolution of Drosophila.


Subject(s)
Drosophila/genetics , Genome, Insect , Insulator Elements , Animals , Chromosome Mapping , Drosophila/metabolism , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Protein Binding
18.
Bioinformatics ; 25(22): 3001-4, 2009 Nov 15.
Article in English | MEDLINE | ID: mdl-19656951

ABSTRACT

MOTIVATION: The highly coordinated expression of thousands of genes in an organism is regulated by the concerted action of transcription factors, chromatin proteins and epigenetic mechanisms. High-throughput experimental data for genome-wide in vivo protein-DNA interactions and epigenetic marks are becoming available from large projects, such as the model organism ENCyclopedia Of DNA Elements (modENCODE), and from individual labs. Dissemination and visualization of these datasets in an explorable form is an important challenge. RESULTS: To support research on Drosophila melanogaster transcription regulation and make the genome-wide in vivo protein-DNA interaction data available to the scientific community as a whole, we have developed a system called Flynet. Currently, Flynet contains 101 datasets for 38 transcription factors and chromatin regulator proteins in different experimental conditions. These factors exhibit different types of binding profiles ranging from sharp localized peaks to broad binding regions. The protein-DNA interaction data in Flynet were obtained from the analysis of chromatin immunoprecipitation experiments on one-color and two-color genomic tiling arrays as well as chromatin immunoprecipitation followed by massively parallel sequencing. A web-based interface, integrated with an AJAX-based genome browser, has been built for queries and for presenting analysis results. Flynet also makes available the cis-regulatory modules reported in the literature, known and de novo identified sequence motifs across the genome, and other resources to study gene regulation. AVAILABILITY: Flynet is available at https://www.cistrack.org/flynet/.


Subject(s)
Computational Biology/methods , Drosophila melanogaster/genetics , Gene Regulatory Networks/genetics , Genome , Software , Animals , Chromatin Immunoprecipitation , Drosophila Proteins/genetics , Transcription Factors/genetics
19.
Mol Syst Biol ; 4: 188, 2008.
Article in English | MEDLINE | ID: mdl-18414489

ABSTRACT

We demonstrate an integrated approach to the study of a transcriptional regulatory cascade involved in the progression of breast cancer and we identify a protein associated with disease progression. Using chromatin immunoprecipitation and genome tiling arrays, whole genome mapping of transcription factor-binding sites was combined with gene expression profiling to identify genes involved in the proliferative response to estrogen (E2). Using RNA interference, selected ERalpha and c-MYC gene targets were knocked down to identify mediators of E2-stimulated cell proliferation. Tissue microarray screening revealed that high expression of an epigenetic factor, the E2-inducible histone variant H2A.Z, is significantly associated with lymph node metastasis and decreased breast cancer survival. Detection of H2A.Z levels independently increased the prognostic power of biomarkers currently in clinical use. This integrated approach has accelerated the identification of a molecule linked to breast cancer progression, has implications for diagnostic and therapeutic interventions, and can be applied to a wide range of cancers.


Subject(s)
Breast Neoplasms/metabolism , Estrogens/metabolism , Histones/chemistry , Biomarkers, Tumor/metabolism , Chromatin/chemistry , Disease Progression , Epigenesis, Genetic , Estrogen Receptor alpha/metabolism , Genome , Humans , Lymphatic Metastasis , Models, Biological , Proto-Oncogene Proteins c-myc/metabolism , RNA Interference
20.
Gene ; 407(1-2): 199-215, 2008 Jan 15.
Article in English | MEDLINE | ID: mdl-17996400

ABSTRACT

Systematically annotating the function of enzymes that belong to large protein families encoded in a single eukaryotic genome is a very challenging task. We carried out such an exercise to annotate function for the serine-protease family of the trypsin fold in Drosophila melanogaster, with an emphasis on annotating serine-protease homologues (SPHs) that may have lost their catalytic function. Our approach involves data mining and data integration to provide function annotations for 190 Drosophila gene products containing serine-protease-like domains, of which 35 are SPHs. This was accomplished by analysis of structure-function relationships, gene-expression profiles, large-scale protein-protein interaction data, literature mining and bioinformatic tools. We introduce functional residue clustering (FRC), a method that performs hierarchical clustering of sequences using properties of functionally important residues and utilizes the correlation coefficient as a quantitative similarity measure to transfer in vivo substrate specificities to proteases. We show that the efficiency of transfer of substrate-specificity information using this method is generally high. FRC was also applied to Drosophila proteases to assign putative competitive inhibitor relationships (CIRs). Microarray gene-expression data were utilized to uncover a large-scale and dual involvement of proteases in development and in immune response. We found specific recruitment of SPHs and proteases with CLIP domains in immune response, suggesting evolution of a new function for SPHs. We also suggest the existence of separate downstream protease cascades for immune response against bacterial/fungal infections and parasite/parasitoid infections. We verify the quality of our annotations using information from RNAi screens and other evidence types. Utilization of such multi-fold approaches results in a 10-fold increase of function annotation for Drosophila serine proteases and demonstrates value in increasing annotations in multiple genomes.


Subject(s)
Computational Biology/methods , Drosophila Proteins/metabolism , Drosophila melanogaster/enzymology , Sequence Analysis, Protein/methods , Serine Endopeptidases/metabolism , Animals , Cluster Analysis , Drosophila Proteins/chemistry , Drosophila Proteins/genetics , Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Embryonic Development/genetics , Gene Expression Profiling , Molecular Sequence Data , Protein Interaction Mapping , Protein Structure, Tertiary , Serine Endopeptidases/chemistry , Serine Endopeptidases/genetics , Serine Proteinase Inhibitors/pharmacology , Substrate Specificity