Results 1 - 20 of 21
2.
Cell Rep Med ; 3(10): 100763, 2022 10 18.
Article in English | MEDLINE | ID: mdl-36198307

ABSTRACT

Environmental and genetic factors cause defects in pancreatic islets driving type 2 diabetes (T2D) together with the progression of multi-tissue insulin resistance. Mass spectrometry proteomics on samples from five key metabolic tissues of a cross-sectional cohort of 43 multi-organ donors provides deep coverage of their proteomes. Enrichment analysis of Gene Ontology terms provides a tissue-specific map of altered biological processes across healthy, prediabetes (PD), and T2D subjects. We find widespread alterations in several relevant biological pathways, including increase in hemostasis in pancreatic islets of PD, increase in the complement cascade in liver and pancreatic islets of PD, and elevation in cholesterol biosynthesis in liver of T2D. Our findings point to inflammatory, immune, and vascular alterations in pancreatic islets in PD that are hypotheses to be tested for potential contributions to hormonal perturbations such as impaired insulin and increased glucagon production. This multi-tissue proteomic map suggests tissue-specific metabolic dysregulations in T2D.


Subject(s)
Diabetes Mellitus, Type 2 , Prediabetic State , Humans , Diabetes Mellitus, Type 2/diagnosis , Prediabetic State/diagnosis , Proteomics , Glucagon/metabolism , Proteome/metabolism , Cross-Sectional Studies , Insulin/genetics , Metabolic Networks and Pathways/genetics , Cholesterol
3.
Sci Rep ; 12(1): 7433, 2022 05 06.
Article in English | MEDLINE | ID: mdl-35523803

ABSTRACT

Transcriptomic analyses are commonly used to identify differentially expressed genes between patients and controls, or within individuals across disease courses. These methods, whilst effective, cannot encompass the combinatorial effects of genes driving disease. We applied rule-based machine learning (RBML) models and rule networks (RN) to an existing paediatric Systemic Lupus Erythematosus (SLE) blood expression dataset, with the goal of developing gene networks to separate low and high disease activity (DA1 and DA3). The resultant model had an 81% accuracy to distinguish between DA1 and DA3, with unsupervised hierarchical clustering revealing additional subgroups indicative of the immune axis involved or state of disease flare. These subgroups correlated with clinical variables, suggesting that the gene sets identified may further the understanding of gene networks that act in concert to drive disease progression. This included roles for genes (i) induced by interferons (IFI35 and OTOF), (ii) key to SLE cell types (KLRB1 encoding CD161), or (iii) with roles in autophagy and NF-κB pathway responses (CKAP4). As demonstrated here, RBML approaches have the potential to reveal novel gene patterns from within a heterogeneous disease, facilitating patient clinical and therapeutic stratification.


Subject(s)
Gene Expression Profiling , Lupus Erythematosus, Systemic , Child , Gene Expression , Gene Expression Profiling/methods , Gene Regulatory Networks , Humans , Machine Learning
4.
Cancers (Basel) ; 14(4), 2022 Feb 17.
Article in English | MEDLINE | ID: mdl-35205761

ABSTRACT

Gliomas develop and grow in the brain and central nervous system. Examining glioma grading processes is valuable for addressing therapeutic challenges. One of the most extensive repositories storing transcriptomics data for gliomas is The Cancer Genome Atlas (TCGA). However, such big cohorts should be processed with caution and evaluated thoroughly as they can contain batch and other effects. Furthermore, biological mechanisms of cancer involve interactions among biomarkers. Thus, we applied an interpretable machine learning approach to discover such relationships. This type of transparent learning not only provides good predictability, but also reveals co-predictive mechanisms among features. In this study, we corrected the strong and confounded batch effect in the TCGA glioma data. We further used the corrected datasets to perform a comprehensive machine learning analysis applied to single-sample gene set enrichment scores using collections from the Molecular Signature Database. Furthermore, using rule-based classifiers, we displayed networks of co-enrichment related to glioma grades. Moreover, we validated our results using external glioma cohorts. We believe that utilizing corrected glioma cohorts from TCGA may improve the application and validation of any future studies. Finally, the co-enrichment and survival analysis provided detailed explanations for glioma progression and should consequently support targeted treatment.

6.
Metabolites ; 11(11), 2021 Oct 28.
Article in English | MEDLINE | ID: mdl-34822401

ABSTRACT

Small-compound databases contain a large amount of information for metabolites and metabolic pathways. However, the plethora of such databases and the redundancy of their information lead to major issues with analysis and standardization. A lack of preventive establishment of means of data access at the infant stages of a project might lead to mislabelled compounds, reduced statistical power, and large delays in delivery of results. We developed MetaFetcheR, an open-source R package that links metabolite data from several small-compound databases, resolves inconsistencies, and covers a variety of use-cases of data fetching. We showed that the performance of MetaFetcheR was superior to existing approaches and databases by benchmarking the performance of the algorithm in three independent case studies based on two published datasets.

7.
OMICS ; 25(10): 652-659, 2021 10.
Article in English | MEDLINE | ID: mdl-34520261

ABSTRACT

Type 2 diabetes (T2D) is characterized by pathophysiological alterations in lipid metabolism. One strategy to understand the molecular mechanisms behind these abnormalities is to identify cis-regulatory elements (CREs) located in chromatin-accessible regions of the genome that regulate key genes. In this study we integrated assay for transposase-accessible chromatin followed by sequencing (ATAC-seq) data, widely used to decode chromatin accessibility, with multi-omics data and publicly available CRE databases to identify candidate CREs associated with T2D for further experimental validations. We performed high-sensitive ATAC-seq in nine human liver samples from normal and T2D donors, and identified a set of differentially accessible regions (DARs). We identified seven DARs including a candidate enhancer for the ACOT1 gene that regulates the balance of acyl-CoA and free fatty acids (FFAs) in the cytoplasm. The relevance of ACOT1 regulation in T2D was supported by the analysis of transcriptomics and proteomics data in liver tissue. Long-chain acyl-CoA thioesterases (ACOTs) are a group of enzymes that hydrolyze acyl-CoA esters to FFAs and coenzyme A. ACOTs have been associated with regulation of triglyceride levels, fatty acid oxidation, mitochondrial function, and insulin signaling, linking their regulation to the pathogenesis of T2D. Our strategy integrating chromatin accessibility with DNA binding and other types of omics provides novel insights on the role of genetic regulation in T2D and is extendable to other complex multifactorial diseases.


Subject(s)
Diabetes Mellitus, Type 2 , Lipid Metabolism , Chromatin/metabolism , Chromatin Immunoprecipitation Sequencing , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Humans , Lipid Metabolism/genetics , Liver/metabolism , Thiolester Hydrolases/genetics , Thiolester Hydrolases/metabolism
8.
Life Sci Alliance ; 4(7), 2021 07.
Article in English | MEDLINE | ID: mdl-34099540

ABSTRACT

Recent studies have suggested that dysregulated YY1 plays a pivotal role in many liver diseases. To obtain a detailed view of genes and pathways regulated by YY1 in the liver, we carried out RNA sequencing in HepG2 cells after YY1 knockdown. A rigid set of 2,081 differentially expressed genes was identified by comparing the YY1-knockdown samples (n = 8) with the control samples (n = 14). YY1 knockdown significantly decreased the expression of several key transcription factors and their coactivators in lipid metabolism. This is illustrated by YY1 regulating PPARA expression through binding to its promoter and enhancer regions. Our study further suggests that down-regulation of the key transcription factors together with YY1 knockdown significantly decreased the cooperation between YY1 and these transcription factors at various regulatory regions, which are important in regulating the expression of genes in hepatic lipid metabolism. This was supported by the finding that the expression of SCD and ELOVL6, encoding key enzymes in lipogenesis, was regulated by the cooperation between YY1 and the PPARA/RXRA complex at their promoters.


Subject(s)
Lipid Metabolism/genetics , Liver/metabolism , YY1 Transcription Factor/metabolism , Base Sequence , Fatty Acid Elongases , Hep G2 Cells , Humans , Lipid Metabolism/physiology , PPAR alpha/genetics , Promoter Regions, Genetic/genetics , Retinoid X Receptor alpha , Stearoyl-CoA Desaturase , Transcription Factors/genetics , YY1 Transcription Factor/genetics , YY1 Transcription Factor/physiology
9.
Nat Commun ; 12(1): 3621, 2021 06 15.
Article in English | MEDLINE | ID: mdl-34131149

ABSTRACT

Chromatin structure and accessibility, and combinatorial binding of transcription factors to regulatory elements in genomic DNA control transcription. Genetic variations in genes encoding histones, epigenetics-related enzymes or modifiers affect chromatin structure/dynamics and result in alterations in gene expression contributing to cancer development or progression. Gliomas are brain tumors frequently associated with epigenetics-related gene deregulation. We perform whole-genome mapping of chromatin accessibility, histone modifications, DNA methylation patterns and transcriptome analysis simultaneously in multiple tumor samples to unravel epigenetic dysfunctions driving gliomagenesis. Based on the results of the integrative analysis of the acquired profiles, we create an atlas of active enhancers and promoters in benign and malignant gliomas. We explore these elements and intersect with Hi-C data to uncover molecular mechanisms instructing gene expression in gliomas.


Subject(s)
Chromatin , Glioma/genetics , Regulatory Sequences, Nucleic Acid , Binding Sites , Brain Neoplasms/genetics , Chromatin Immunoprecipitation , DNA/metabolism , DNA Methylation , DNA-Binding Proteins/metabolism , Enhancer of Zeste Homolog 2 Protein , Epigenesis, Genetic , Epigenomics , Forkhead Box Protein M1 , Gene Expression , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Glioblastoma , Histone Code , Histones , Humans , Promoter Regions, Genetic , Transcription Factors/metabolism
10.
Front Genet ; 12: 618277, 2021.
Article in English | MEDLINE | ID: mdl-33719335

ABSTRACT

Autism spectrum disorder (ASD) is a heterogeneous neuropsychiatric disorder with a complex genetic background. Analysis of altered molecular processes in ASD patients requires linear and nonlinear methods that provide interpretable solutions. Interpretable machine learning provides legible models that allow explaining biological mechanisms and support analysis of clinical subgroups. In this work, we investigated several case-control studies of gene expression measurements of ASD individuals. We constructed a rule-based learning model from three independent datasets that we further visualized as a nonlinear gene-gene co-predictive network. To find dissimilarities between ASD subtypes, we scrutinized a topological structure of the network and estimated a centrality distance. Our analysis revealed that autism is the most severe subtype of ASD, while pervasive developmental disorder-not otherwise specified and Asperger syndrome are closely related and milder ASD subtypes. Furthermore, we analyzed the most important ASD-related features that were described in terms of gene co-predictors. Among others, we found a strong co-predictive mechanism between EMC4 and TMEM30A, which may suggest a co-regulation between these genes. The present study demonstrates the potential of applying interpretable machine learning in bioinformatics analyses. Although the proposed methodology was designed for transcriptomics data, it can be applied to other omics disciplines.

11.
BMC Bioinformatics ; 22(1): 110, 2021 Mar 06.
Article in English | MEDLINE | ID: mdl-33676405

ABSTRACT

BACKGROUND: Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more important to understand how a prediction was obtained rather than knowing what prediction was made. To this end so-called interpretable machine learning has been recently advocated. In this study, we implemented an interpretable machine learning package based on the rough set theory. An important aim of our work was provision of statistical properties of the models and their components. RESULTS: We present the R.ROSETTA package, which is an R wrapper of ROSETTA framework. The original ROSETTA functions have been improved and adapted to the R programming environment. The package allows for building and analyzing non-linear interpretable machine learning models. R.ROSETTA gathers combinatorial statistics via rule-based modelling for accessible and transparent results, well-suited for adoption within the greater scientific community. The package also provides statistics and visualization tools that facilitate minimization of analysis bias and noise. The R.ROSETTA package is freely available at https://github.com/komorowskilab/R.ROSETTA . To illustrate the usage of the package, we applied it to a transcriptome dataset from an autism case-control study. Our tool provided hypotheses for potential co-predictive mechanisms among features that discerned phenotype classes. These co-predictors represented neurodevelopmental and autism-related genes. CONCLUSIONS: R.ROSETTA provides new insights for interpretable machine learning analyses and knowledge-based systems. We demonstrated that our package facilitated detection of dependencies for autism-related genes. Although the sample application of R.ROSETTA illustrates transcriptome data analysis, the package can be used to analyze any data organized in decision tables.


Subject(s)
Algorithms , Machine Learning , Case-Control Studies , Computational Biology , Data Mining
12.
Plant J ; 105(6): 1534-1548, 2021 03.
Article in English | MEDLINE | ID: mdl-33314374

ABSTRACT

Arabidopsis thaliana 45S ribosomal genes (rDNA) are located in tandem arrays called nucleolus organizing regions on the termini of chromosomes 2 and 4 (NOR2 and NOR4) and encode rRNA, a crucial structural element of the ribosome. The current model of rDNA organization suggests that inactive rRNA genes accumulate in the condensed chromocenters in the nucleus and at the nucleolar periphery, while the nucleolus delineates active genes. We challenge the perspective that all intranucleolar rDNA is active by showing that a subset of nucleolar rDNA assembles into condensed foci marked by H3.1 and H3.3 histones that also contain the repressive H3K9me2 histone mark. By using plant lines containing a low number of rDNA copies, we further found that the condensed foci relate to the folding of rDNA, which appears to be a common mechanism of rDNA regulation inside the nucleolus. The H3K9me2 histone mark found in condensed foci represents a typical modification of bulk inactive rDNA, as we show by genome-wide approaches, similar to the H2A.W histone variant. The euchromatin histone marks H3K27me3 and H3K4me3, in contrast, do not colocalize with nucleolar foci and their overall levels in the nucleolus are very low. We further demonstrate that the rDNA promoter is an important regulatory region of the rDNA, where the distribution of histone variants and histone modifications are modulated in response to rDNA activity.


Subject(s)
DNA, Plant/genetics , DNA, Ribosomal/genetics , Epigenesis, Genetic/genetics , Arabidopsis/genetics , Cell Nucleolus/genetics , Cell Nucleus/genetics , DNA, Plant/metabolism , DNA, Ribosomal/metabolism , Genetic Markers/genetics , Genetic Variation , Histones/genetics , Histones/metabolism , Plant Roots/metabolism , Transcription, Genetic
13.
Hepatol Res ; 51(2): 233-238, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33119937

ABSTRACT

AIM: The aim of this study was to explore the benefits of data integration from different platforms for single nucleus transcriptomics profiling to characterize cell populations in human liver. METHODS: We generated single-nucleus RNA sequencing data from Chromium 10X Genomics and Drop-seq for a human liver sample. We utilized state of the art bioinformatics tools to undertake a rigorous quality control and to integrate the data into a common space summarizing the gene expression variation from the respective platforms, while accounting for known and unknown confounding factors. RESULTS: Analysis of single nuclei transcriptomes from both 10X and Drop-seq allowed identification of the major liver cell types, while the integrated set obtained enough statistical power to separate a small population of inactive hepatic stellate cells that was not characterized in either of the platforms. CONCLUSIONS: Integration of droplet-based single nucleus transcriptomics data enabled identification of a small cluster of inactive hepatic stellate cells that highlights the potential of our approach. We suggest single-nucleus RNA sequencing integrative approaches could be utilized to design larger and cost-effective studies.

14.
Sci Rep ; 10(1): 8343, 2020 05 20.
Article in English | MEDLINE | ID: mdl-32433479

ABSTRACT

Alteration of various metabolites has been linked to type 2 diabetes (T2D) and insulin resistance. However, identifying significant associations between metabolites and tissue-specific phenotypes requires a multi-omics approach. In a cohort of 42 subjects with different levels of glucose tolerance (normal, prediabetes and T2D) matched for age and body mass index, we calculated associations between parameters of whole-body positron emission tomography (PET)/magnetic resonance imaging (MRI) during hyperinsulinemic euglycemic clamp and non-targeted metabolomics profiling for subcutaneous adipose tissue (SAT) and plasma. Plasma metabolomics profiling revealed that hepatic fat content was positively associated with tyrosine, and negatively associated with lysoPC(P-16:0). Visceral adipose tissue (VAT) and SAT insulin sensitivity (Ki) were positively associated with several lysophospholipids, while the opposite applied to branched-chain amino acids. The adipose tissue metabolomics revealed a positive association between non-esterified fatty acids and both VAT and liver Ki. Bile acids and carnitines in adipose tissue were inversely associated with VAT Ki. Furthermore, we detected several metabolites that were significantly higher in T2D than in normal/prediabetes subjects. In this study we present novel associations between several metabolites from SAT and plasma with the fat fraction, volume and insulin sensitivity of various tissues throughout the body, demonstrating the benefit of an integrative multi-omics approach.


Subject(s)
Diabetes Mellitus, Type 2/diagnosis , Insulin Resistance , Insulin/metabolism , Prediabetic State/diagnosis , Whole Body Imaging/methods , Aged , Amino Acids, Branched-Chain/metabolism , Case-Control Studies , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/metabolism , Female , Fluorodeoxyglucose F18/administration & dosage , Fluorodeoxyglucose F18/metabolism , Glucose Clamp Technique , Humans , Intra-Abdominal Fat/metabolism , Lipid Metabolism , Liver/metabolism , Lysophospholipids/metabolism , Magnetic Resonance Imaging/methods , Male , Metabolomics , Middle Aged , Multimodal Imaging/methods , Positron-Emission Tomography/methods , Prediabetic State/blood , Prediabetic State/metabolism , Subcutaneous Fat/metabolism
15.
OMICS ; 24(4): 180-194, 2020 04.
Article in English | MEDLINE | ID: mdl-32181701

ABSTRACT

The liver is the largest solid organ and a primary metabolic hub. In recent years, intact cell nuclei were used to perform single-nuclei RNA-seq (snRNA-seq) for tissues difficult to dissociate and for flash-frozen archived tissue samples to discover unknown and rare cell subpopulations. In this study, we performed snRNA-seq of a liver sample to identify subpopulations of cells based on nuclear transcriptomics. In 4282 single nuclei, we detected, on average, 1377 active genes and we identified seven major cell types. We integrated data from 94,286 distal interactions (p < 0.05) for 7682 promoters from a targeted chromosome conformation capture technique (HiCap) and mass spectrometry proteomics for the same liver sample. We observed a reasonable correlation between proteomics and in silico bulk snRNA-seq (r = 0.47) using tissue-independent gene-specific protein abundance estimation factors. We specifically looked at genes of medical importance. The DPYD gene is involved in the pharmacogenetics of fluoropyrimidine toxicity and some of its variants are analyzed for clinical purposes. We identified a new putative polymorphic regulatory element, which may contribute to variation in toxicity. Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and we investigated all known risk genes. We identified a complex regulatory landscape for the SLC2A2 gene with 16 candidate enhancers. Three of them harbor somatic motif-breaking and other mutations in HCC in the Pan Cancer Analysis of Whole Genomes dataset and are candidates to contribute to malignancy. Our results highlight the potential of a multi-omics approach in the study of human diseases.


Subject(s)
Carcinoma, Hepatocellular/genetics , Cell Nucleus/genetics , Computational Biology/methods , Liver Neoplasms/genetics , Liver/metabolism , Transcriptome , B-Lymphocytes/cytology , B-Lymphocytes/metabolism , Carcinoma, Hepatocellular/metabolism , Carcinoma, Hepatocellular/pathology , Cell Nucleus/metabolism , Endothelial Cells/cytology , Endothelial Cells/metabolism , Gene Expression Regulation , Hepatic Stellate Cells/cytology , Hepatic Stellate Cells/metabolism , Hepatocytes/cytology , Hepatocytes/metabolism , High-Throughput Nucleotide Sequencing , Humans , Killer Cells, Natural/cytology , Killer Cells, Natural/metabolism , Kupffer Cells/cytology , Kupffer Cells/metabolism , Liver/cytology , Liver Neoplasms/metabolism , Liver Neoplasms/pathology , Single-Cell Analysis/methods , T-Lymphocytes/cytology , T-Lymphocytes/metabolism
16.
Nature ; 578(7793): 102-111, 2020 02.
Article in English | MEDLINE | ID: mdl-32025015

ABSTRACT

The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.


Subject(s)
Genome, Human/genetics , Mutation/genetics , Neoplasms/genetics , DNA Breaks , Databases, Genetic , Gene Expression Regulation, Neoplastic , Genome-Wide Association Study , Humans , INDEL Mutation
18.
Sci Rep ; 9(1): 9653, 2019 07 04.
Article in English | MEDLINE | ID: mdl-31273253

ABSTRACT

Type 2 diabetes mellitus (T2D) is a complex metabolic disease commonly caused by insulin resistance in several tissues. We performed a matched two-dimensional metabolic screening in tissue samples from 43 multi-organ donors. The intra-individual analysis was assessed across five key metabolic tissues (serum, visceral adipose tissue, liver, pancreatic islets and skeletal muscle), and the inter-individual analysis across three different groups reflecting T2D progression. We identified 92 metabolites differing significantly between non-diabetes and T2D subjects. In diabetes cases, carnitines were significantly higher in liver, while lysophosphatidylcholines were significantly lower in muscle and serum. We tracked the primary tissue of origin for multiple metabolites whose alterations were reflected in serum. An investigation of three major stages spanning from controls to pre-diabetes and to overt T2D indicated that a subset of lysophosphatidylcholines was significantly lower in the muscle of pre-diabetes subjects. Moreover, glycodeoxycholic acid was significantly higher in the liver of pre-diabetes subjects, while the additional increase in T2D was not significant. We confirmed many previously reported findings and substantially expanded on them with altered markers for early and overt T2D. Overall, the analysis of this unique dataset can increase the understanding of the metabolic interplay between organs in the development of T2D.


Subject(s)
Biomarkers/metabolism , Carnitine/metabolism , Diabetes Mellitus, Type 2/metabolism , Lysophosphatidylcholines/metabolism , Metabolome , Prediabetic State/metabolism , Aged , Biomarkers/analysis , Case-Control Studies , Diabetes Mellitus, Type 2/pathology , Female , Humans , Insulin Resistance , Intra-Abdominal Fat/metabolism , Intra-Abdominal Fat/pathology , Liver/metabolism , Liver/pathology , Male , Metabolomics , Middle Aged , Muscle, Skeletal/metabolism , Muscle, Skeletal/pathology , Prediabetic State/pathology , Signal Transduction
19.
Sci Rep ; 8(1): 4390, 2018 03 13.
Article in English | MEDLINE | ID: mdl-29535343

ABSTRACT

In order to find clinically useful prognostic markers for glioma patients' survival, we employed the Monte Carlo Feature Selection and Interdependencies Discovery (MCFS-ID) algorithm on DNA methylation (HumanMethylation450 platform) and RNA-seq datasets from The Cancer Genome Atlas (TCGA) for 88 patients observed until death. The input features were ranked according to their importance in predicting patients' longer (400+ days) or shorter (≤400 days) survival without prior classification of the patients. Interestingly, out of the 65 most important features found, 63 are methylation sites, and only two are mRNAs. Moreover, 61 out of the 63 methylation sites are among those detected by the 450 k array technology, while being absent in the HumanMethylation27. The most important methylation feature (cg15072976) overlaps with the RE1 Silencing Transcription Factor (REST) binding site, and was confirmed to intersect with the REST binding motif in human U87 glioma cells. Six additional methylation sites from the top 63 overlap with REST sites. We found that the methylation status of the cg15072976 site affects transcription factor binding in U87 cells in a gel shift assay. The cg15072976 methylation status discriminates ≤400 and 400+ patients in an independent dataset from TCGA and shows a positive association with survival time as evidenced by Kaplan-Meier plots.


Subject(s)
DNA Methylation , Epigenesis, Genetic , Glioma/genetics , Glioma/mortality , Transcriptome , Computational Biology/methods , CpG Islands , DNA/chemistry , DNA/genetics , DNA/metabolism , Gene Expression Profiling , Glioma/pathology , Humans , Kaplan-Meier Estimate , Molecular Conformation , Molecular Sequence Annotation , Monte Carlo Method , Mutation , Neoplasm Grading , Neoplasm Staging , Prognosis , Promoter Regions, Genetic , Structure-Activity Relationship
20.
Nucleic Acids Res ; 44(19): 9110-9120, 2016 Nov 02.
Article in English | MEDLINE | ID: mdl-27625394

ABSTRACT

Gene transcription is regulated mainly by transcription factors (TFs). ENCODE and Roadmap Epigenomics provide global binding profiles of TFs, which can be used to identify regulatory regions. To this end we implemented a method to systematically construct cell-type and species-specific maps of regulatory regions and TF-TF interactions. We illustrated the approach by developing maps for five human cell-lines and two other species. We detected ∼144k putative regulatory regions among the human cell-lines, with the majority of them being ∼300 bp. We found ∼20k putative regulatory elements in the ENCODE heterochromatic domains suggesting a large regulatory potential in the regions presumed transcriptionally silent. Among the most significant TF interactions identified in the heterochromatic regions were CTCF and the cohesin complex, which is in agreement with previous reports. Finally, we investigated the enrichment of the obtained putative regulatory regions in the 3D chromatin domains. More than 90% of the regions were discovered in the 3D contacting domains. We found a significant enrichment of GWAS SNPs in the putative regulatory regions. These significant enrichments provide evidence that the regulatory regions play a crucial role in the genomic structural stability. Additionally, we generated maps of putative regulatory regions for prostate and colorectal cancer human cell-lines.


Subject(s)
Genomics , Regulatory Sequences, Nucleic Acid , Binding Sites , Cell Line , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation , Chromosome Mapping , Computational Biology/methods , Genome, Human , Genome-Wide Association Study , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Protein Binding , Protein Interaction Mapping , Protein Interaction Maps , Transcription Factors/metabolism