Results 1 - 20 of 26
1.
bioRxiv ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38826254

ABSTRACT

Background: Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of regulatory programs this variation affects can shed light on the apparatuses of human diseases. Results: We collected epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we constructed networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks served as the base for a rich series of analyses, through which we demonstrated their temporal dynamics and enrichment for various disease-associated variants. We applied the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrated methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. Conclusions: Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes. This includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
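The clustering step named in this abstract can be illustrated with a minimal sketch of one Girvan-Newman split: repeatedly remove the edge with the highest betweenness until the network falls into more components. This is not the E-P-INAnalyzer implementation; the toy enhancer-promoter network and the node names (E1..E4, P1, P2) are hypothetical, and the graph is treated as unweighted and undirected.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each node.
        sigma = {n: 0 for n in adj}
        dist = {n: -1 for n in adj}
        preds = {n: [] for n in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each edge on a path.
        delta = {n: 0.0 for n in adj}
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    return score


def components(adj):
    """Return the connected components of the graph as a list of node sets."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in part:
                part.add(n)
                stack.extend(adj[n] - part)
        seen |= part
        parts.append(part)
    return parts


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = len(components(adj))
    while len(components(adj)) == start and any(adj.values()):
        score = edge_betweenness(adj)
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical network: two enhancer-promoter modules joined by one link.
toy_edges = [
    ("E1", "P1"), ("E2", "P1"), ("E1", "E2"),
    ("E3", "P2"), ("E4", "P2"), ("E3", "E4"),
    ("P1", "P2"),
]
communities = girvan_newman_split(toy_edges)
```

In this toy network the P1-P2 link carries every shortest path between the two modules, so it is removed first and each module becomes its own community.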

2.
Nucleic Acids Res ; 52(4): 1613-1627, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38296821

ABSTRACT

The advent of the perturbation-based massively parallel reporter assay (MPRA) technique has facilitated the delineation of the roles of non-coding regulatory elements in orchestrating gene expression. However, computational efforts remain scant to evaluate and establish guidelines for sequence design strategies for perturbation MPRAs. In this study, we propose a framework for evaluating and comparing various perturbation strategies for MPRA experiments. Within this framework, we benchmark three different perturbation approaches from the perspectives of alteration in motif-based profiles, consistency of MPRA outputs, and robustness of models that predict the activities of putative regulatory motifs. While our analyses show very similar results across multiple benchmarking metrics, the predictive modeling for the approach involving random nucleotide shuffling shows significant robustness compared with the other two approaches. Thus, we recommend designing sequences by randomly shuffling the nucleotides of the perturbed site in perturbation-MPRA, followed by a coherence check to prevent the introduction of other variations of the target motifs. In summary, our evaluation framework and the benchmarking findings create a resource of computational pipelines and highlight the potential of perturbation-MPRA in predicting non-coding regulatory activities.
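The recommended design (shuffle the perturbed site, then run a coherence check) might be sketched as follows. This is an illustrative simplification, not the authors' pipeline: the function name `shuffle_site`, the example sequence, and the use of an exact-string match as the coherence check are all assumptions. A real check would more likely score candidates against the motif model itself.

```python
import random


def shuffle_site(seq, start, end, motif, rng, max_tries=100):
    """Shuffle seq[start:end]; reject candidates that still contain `motif`.

    The exact-match test stands in for the coherence check (an assumption).
    """
    site = list(seq[start:end])
    for _ in range(max_tries):
        rng.shuffle(site)
        candidate = seq[:start] + "".join(site) + seq[end:]
        if motif not in candidate:
            return candidate
    raise ValueError("no motif-free shuffle found within max_tries")


# Hypothetical 16-nt element with a target motif at positions 4..11.
sequence = "ACGTTGACGTCAAGCT"
target_motif = "TGACGTCA"
perturbed = shuffle_site(sequence, 4, 12, target_motif, random.Random(0))
```

Shuffling keeps the nucleotide composition of the site unchanged, so any difference in reporter activity is attributable to the order of the bases rather than to a shift in GC content.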


Subject(s)
Genetic Techniques , Regulatory Sequences, Nucleic Acid , Nucleotides
3.
bioRxiv ; 2023 Sep 29.
Article in English | MEDLINE | ID: mdl-37808807

ABSTRACT

The advent of the perturbation-based massively parallel reporter assay (MPRA) technique has enabled delineation of the roles of non-coding regulatory elements in orchestrating gene expression. However, computational efforts remain scant to evaluate and establish guidelines for sequence design strategies for perturbation MPRAs. Here, we propose a framework for evaluating and comparing various perturbation strategies for MPRA experiments. Under this framework, we benchmark three different perturbation approaches from the perspectives of alteration in motif-based profiles, consistency of MPRA outputs, and robustness of models that predict the activities of putative regulatory motifs. Although our analyses show similar yet significant results across multiple metrics, the method of randomly shuffling nucleotides outperforms the other two methods. Thus, we recommend designing sequences by randomly shuffling the nucleotides of the perturbed site in perturbation-MPRA. The evaluation framework, together with the benchmarking findings in our work, creates a resource of computational pipelines and illustrates the promise of perturbation-MPRA for predicting non-coding regulatory activities.

4.
Brief Bioinform ; 24(5), 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37598422

ABSTRACT

The advent of single-cell RNA sequencing (scRNA-seq) technologies has enabled gene expression profiling at single-cell resolution, allowing transcriptional variability among individual cells to be quantified and compared. Although alterations in transcriptional variability have been observed in various biological states, statistical methods for quantifying and testing differential variability between groups of cells are still lacking. To identify the best practices in differential variability analysis of single-cell gene expression data, we propose and compare 12 statistical pipelines using different combinations of methods for normalization, feature selection, dimensionality reduction and variability calculation. Using high-quality synthetic scRNA-seq datasets, we benchmarked the proposed pipelines and found that the most powerful and accurate pipeline performs simple library size normalization, retains all genes in analysis and uses denSNE-based distances to cluster medoids as the variability measure. By applying this pipeline to scRNA-seq datasets of COVID-19 and autism patients, we have identified cellular variability changes between patients with different severity status or between patients and healthy controls.


Subject(s)
COVID-19 , Humans , COVID-19/genetics , Gene Expression Profiling/methods , Gene Expression , Sequence Analysis, RNA/methods , Cluster Analysis
5.
Int J Mol Sci ; 24(4), 2023 Feb 09.
Article in English | MEDLINE | ID: mdl-36834916

ABSTRACT

Autism spectrum disorder (ASD) is a common, complex, and highly heritable condition with contributions from both common and rare genetic variations. While disruptive, rare variants in protein-coding regions clearly contribute to symptoms, the role of rare non-coding variants remains unclear. Variants in these regions, including promoters, can alter downstream RNA and protein quantity; however, the functional impacts of specific variants observed in ASD cohorts remain largely uncharacterized. Here, we analyzed 3600 de novo mutations in promoter regions previously identified by whole-genome sequencing of autistic probands and neurotypical siblings to test the hypothesis that mutations in cases have a greater functional impact than those in controls. We leveraged massively parallel reporter assays (MPRAs) to detect transcriptional consequences of these variants in neural progenitor cells and identified 165 functionally high confidence de novo variants (HcDNVs). While these HcDNVs are enriched for markers of active transcription, disruption to transcription factor binding sites, and open chromatin, we did not identify differences in functional impact based on ASD diagnostic status.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Humans , Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease , Mutation , Autistic Disorder/genetics , Promoter Regions, Genetic
6.
Nat Commun ; 13(1): 1504, 2022 Mar 21.
Article in English | MEDLINE | ID: mdl-35315433

ABSTRACT

Gene regulatory elements play a key role in orchestrating gene expression during cellular differentiation, but what determines their function over time remains largely unknown. Here, we perform perturbation-based massively parallel reporter assays at seven early time points of neural differentiation to systematically characterize how regulatory elements and motifs within them guide cellular differentiation. By perturbing over 2,000 putative DNA binding motifs in active regulatory regions, we delineate four categories of functional elements, and observe that activity direction is mostly determined by the sequence itself, while the magnitude of effect depends on the cellular environment. We also find that fine-tuning transcription rates is often achieved by a combined activity of adjacent activating and repressing elements. Our work provides a blueprint for the sequence components needed to induce different transcriptional patterns in general and specifically during neural differentiation.


Subject(s)
Biological Assay , Regulatory Sequences, Nucleic Acid , Regulatory Sequences, Nucleic Acid/genetics
7.
Nature ; 598(7879): 205-213, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34616060

ABSTRACT

During mammalian development, differences in chromatin state coincide with cellular differentiation and reflect changes in the gene regulatory landscape [1]. In the developing brain, cell fate specification and topographic identity are important for defining cell identity [2] and confer selective vulnerabilities to neurodevelopmental disorders [3]. Here, to identify cell-type-specific chromatin accessibility patterns in the developing human brain, we used a single-cell assay for transposase accessibility by sequencing (scATAC-seq) in primary tissue samples from the human forebrain. We applied unbiased analyses to identify genomic loci that undergo extensive cell-type- and brain-region-specific changes in accessibility during neurogenesis, and an integrative analysis to predict cell-type-specific candidate regulatory elements. We found that cerebral organoids recapitulate most putative cell-type-specific enhancer accessibility patterns but lack many cell-type-specific open chromatin regions that are found in vivo. Systematic comparison of chromatin accessibility across brain regions revealed unexpected diversity among neural progenitor cells in the cerebral cortex and implicated retinoic acid signalling in the specification of neuronal lineage identity in the prefrontal cortex. Together, our results reveal the important contribution of chromatin state to the emerging patterns of cell type diversity and cell fate specification and provide a blueprint for evaluating the fidelity and robustness of cerebral organoids as a model for cortical development.


Subject(s)
Brain/cytology , Epigenomics , Neurogenesis , Single-Cell Analysis , Atlases as Topic , Brain/growth & development , Brain/metabolism , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Disease Susceptibility , Enhancer Elements, Genetic , Humans , Neurons/cytology , Neurons/metabolism , Organoids/cytology , Tretinoin/metabolism
9.
Nat Protoc ; 15(8): 2387-2412, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32641802

ABSTRACT

Massively parallel reporter assays (MPRAs) can simultaneously measure the function of thousands of candidate regulatory sequences (CRSs) in a quantitative manner. In this method, CRSs are cloned upstream of a minimal promoter and reporter gene, alongside a unique barcode, and introduced into cells. If the CRS is a functional regulatory element, it will lead to the transcription of the barcode sequence, which is measured via RNA sequencing and normalized for cellular integration via DNA sequencing of the barcode. This technology has been used to test thousands of sequences and their variants for regulatory activity, to decipher the regulatory code and its evolution, and to develop genetic switches. Lentivirus-based MPRA (lentiMPRA) produces 'in-genome' readouts and enables the use of this technique in hard-to-transfect cells. Here, we provide a detailed protocol for lentiMPRA, along with a user-friendly Nextflow-based computational pipeline, MPRAflow, for quantifying CRS activity from different MPRA designs. The lentiMPRA protocol takes ~2 months, which includes sequencing turnaround time and data processing with MPRAflow.
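The normalization described in this abstract (barcode RNA counts scaled by barcode DNA counts) can be illustrated with a minimal sketch. It is not MPRAflow's procedure: the function name `crs_activity`, the barcode and CRS labels, the pseudocount, and the choice to pool barcodes per CRS before taking a log2 ratio are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def crs_activity(barcode_to_crs, rna_counts, dna_counts, pseudocount=1.0):
    """Return log2(RNA/DNA) per CRS, pooling counts over its barcodes."""
    rna_sum = defaultdict(float)
    dna_sum = defaultdict(float)
    for barcode, crs in barcode_to_crs.items():
        dna = dna_counts.get(barcode, 0)
        if dna == 0:
            # Barcode never seen in DNA: integration cannot be normalized.
            continue
        rna_sum[crs] += rna_counts.get(barcode, 0)
        dna_sum[crs] += dna
    return {
        crs: math.log2((rna_sum[crs] + pseudocount) / (dna_sum[crs] + pseudocount))
        for crs in dna_sum
    }


# Hypothetical counts: two CRSs, two barcodes each.
barcode_to_crs = {"AAAC": "crs_A", "AAAG": "crs_A", "CCCT": "crs_B", "GGGA": "crs_B"}
rna = {"AAAC": 30, "AAAG": 33, "CCCT": 3, "GGGA": 5}
dna = {"AAAC": 7, "AAAG": 8, "CCCT": 7, "GGGA": 0}
activity = crs_activity(barcode_to_crs, rna, dna)
```

A positive value means the barcode is transcribed more than its integrated copy number would predict, and a negative value means less. Production analyses would normally also correct for sequencing depth and model replicate variance.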


Subject(s)
Lentivirus/genetics , Regulatory Sequences, Nucleic Acid/genetics , Sequence Analysis, DNA/methods , Workflow , Base Sequence
10.
Cell Syst ; 11(1): 2-4, 2020 Jul 22.
Article in English | MEDLINE | ID: mdl-32702318

ABSTRACT

One snapshot of the peer review process for "Dissection of c-AMP Response Element Architecture by Using Genomic and Episomal Massively Parallel Reporter Assays" (Davis et al., 2020).


Subject(s)
Gene Expression Regulation , Response Elements , Adenosine Monophosphate , Genomics , Plasmids , Response Elements/genetics
11.
Cell Stem Cell ; 25(5): 713-727.e10, 2019 Nov 07.
Article in English | MEDLINE | ID: mdl-31631012

ABSTRACT

Epigenomic regulation and lineage-specific gene expression act in concert to drive cellular differentiation, but the temporal interplay between these processes is largely unknown. Using neural induction from human pluripotent stem cells (hPSCs) as a paradigm, we interrogated these dynamics by performing RNA sequencing (RNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), and assay for transposase accessible chromatin using sequencing (ATAC-seq) at seven time points during early neural differentiation. We found that changes in DNA accessibility precede H3K27ac, which is followed by gene expression changes. Using massively parallel reporter assays (MPRAs) to test the activity of 2,464 candidate regulatory sequences at all seven time points, we show that many of these sequences have temporal activity patterns that correlate with their respective cell-endogenous gene expression and chromatin changes. A prioritization method incorporating all genomic and MPRA data further identified key transcription factors involved in driving neural fate. These results provide a comprehensive resource of genes and regulatory elements that orchestrate neural induction and illuminate temporal frameworks during differentiation.


Subject(s)
Chromatin/metabolism , Enhancer Elements, Genetic , Gene Expression Regulation, Developmental/genetics , Histones/metabolism , Human Embryonic Stem Cells/metabolism , Neurogenesis/genetics , Transcription Factors/metabolism , Acetylation , Chromatin Immunoprecipitation Sequencing , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Computational Biology , Histones/chemistry , Human Embryonic Stem Cells/drug effects , Humans , Mental Disorders/genetics , Mental Disorders/metabolism , Neurogenesis/drug effects , RNA-Seq , Transcription Factors/genetics
12.
Genome Biol ; 20(1): 183, 2019 Sep 02.
Article in English | MEDLINE | ID: mdl-31477158

ABSTRACT

Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.


Subject(s)
Biological Assay , Genes, Reporter , High-Throughput Nucleotide Sequencing/methods , Software , Statistics as Topic , Alleles , Databases, Genetic , Gene Expression Profiling , Hep G2 Cells , Humans , K562 Cells
13.
Hum Mutat ; 40(9): 1299-1313, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31131957

ABSTRACT

Deciphering the potential of noncoding loci to influence gene regulation has been the subject of intense research, with important implications in understanding genetic underpinnings of human diseases. Massively parallel reporter assays (MPRAs) can measure regulatory activity of thousands of DNA sequences and their variants in a single experiment. With an increasing number of publicly available MPRA data sets, one can now develop data-driven models which, given a DNA sequence, predict its regulatory activity. Here, we performed a comprehensive meta-analysis of several MPRA data sets in a variety of cellular contexts. We first applied an ensemble of methods to predict MPRA output in each context and observed that the most predictive features are consistent across data sets. We then demonstrate that predictive models trained in one cellular context can be used to predict MPRA output in another, with loss of accuracy attributed to cell-type-specific features. Finally, we show that our approach achieves top performance in the Fifth Critical Assessment of Genome Interpretation "Regulation Saturation" Challenge for predicting effects of single-nucleotide variants. Overall, our analysis provides insights into how MPRA data can be leveraged to highlight functional regulatory regions throughout the genome and can guide effective design of future experiments by better prioritizing regions of interest.


Subject(s)
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Regulatory Sequences, Nucleic Acid , Genome, Human , Humans , Models, Genetic , Sequence Analysis, DNA/methods , Software
14.
Hum Mutat ; 40(9): 1280-1291, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31106481

ABSTRACT

The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.


Subject(s)
DNA/chemistry , Epigenomics/methods , Point Mutation , Binding Sites , Cell Line , Chromatin/genetics , DNA/metabolism , Enhancer Elements, Genetic , Genetic Predisposition to Disease , Humans , Machine Learning , Promoter Regions, Genetic , Transcription Factors/metabolism
15.
Sci Rep ; 7(1): 7533, 2017 Aug 08.
Article in English | MEDLINE | ID: mdl-28790348

ABSTRACT

Standard cell culture guidelines often use media supplemented with antibiotics to prevent cell contamination. However, relatively little is known about the effect of antibiotic use in cell culture on gene expression and the extent to which this treatment could confound results. To comprehensively characterize the effect of antibiotic treatment on gene expression, we performed RNA-seq and ChIP-seq for H3K27ac on HepG2 cells, a human liver cell line commonly used for pharmacokinetic, metabolism and genomic studies, cultured in media supplemented with penicillin-streptomycin (PenStrep) or vehicle control. We identified 209 PenStrep-responsive genes, including transcription factors such as ATF3 that are likely to alter the regulation of other genes. Pathway analyses found a significant enrichment for "xenobiotic metabolism signaling" and "PXR/RXR activation" pathways. Our H3K27ac ChIP-seq identified 9,514 peaks that are PenStrep responsive. These peaks were enriched near genes that function in cell differentiation, tRNA modification, nuclease activity and protein dephosphorylation. Our results suggest that PenStrep treatment can significantly alter gene expression and regulation in a common liver cell type such as HepG2, advocating that antibiotic treatment should be taken into account when carrying out genetic, genomic or other biological assays in cultured cells.


Subject(s)
Anti-Bacterial Agents/pharmacology , Gene Expression Regulation, Neoplastic/drug effects , Genome, Human/genetics , Liver/drug effects , Acetylation/drug effects , Chromatin Immunoprecipitation/methods , Hep G2 Cells , Histones/metabolism , Humans , Liver/metabolism , Liver/pathology , Lysine/metabolism , Sequence Analysis, RNA/methods , Signal Transduction/drug effects , Signal Transduction/genetics
16.
Hum Mutat ; 38(9): 1240-1250, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28220625

ABSTRACT

In many human diseases, associated genetic changes tend to occur within noncoding regions, whose effect might be related to transcriptional control. A central goal in human genetics is to understand the function of such noncoding regions: given a region that is statistically associated with changes in gene expression (expression quantitative trait locus [eQTL]), does it in fact play a regulatory role? And if so, how is this role "coded" in its sequence? These questions were the subject of the Critical Assessment of Genome Interpretation eQTL challenge. Participants were given a set of sequences that flank eQTLs in humans and were asked to predict whether these are capable of regulating transcription (as evaluated by massively parallel reporter assays), and whether this capability changes between alternative alleles. Here, we report lessons learned from this community effort. By inspecting predictive properties in isolation, and conducting meta-analysis over the competing methods, we find that using chromatin accessibility and transcription factor binding as features in an ensemble of classifiers or regression models leads to the most accurate results. We then characterize the loci that are harder to predict, putting the spotlight on areas of weakness, which we expect to be the subject of future studies.


Subject(s)
Computational Biology/methods , Gene Expression , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Quantitative Trait Loci
17.
Nat Med ; 22(6): 606-13, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27183217

ABSTRACT

Human leukocyte antigen class I (HLA)-restricted CD8+ T lymphocyte (CTL) responses are crucial to HIV-1 control. Although HIV can evade these responses, the longer-term impact of viral escape mutants remains unclear, as these variants can also reduce intrinsic viral fitness. To address this, we here developed a metric to determine the degree of HIV adaptation to an HLA profile. We demonstrate that transmission of viruses that are pre-adapted to the HLA molecules expressed in the recipient is associated with impaired immunogenicity, elevated viral load and accelerated CD4+ T cell decline. Furthermore, the extent of pre-adaptation among circulating viruses explains much of the variation in outcomes attributed to the expression of certain HLA alleles. Thus, viral pre-adaptation exploits 'holes' in the immune response. Accounting for these holes may be key for vaccine strategies seeking to elicit functional responses from viral variants, and to HIV cure strategies that require broad CTL responses to achieve successful eradication of HIV reservoirs.


Subject(s)
Adaptation, Physiological/immunology , CD8-Positive T-Lymphocytes/immunology , HIV Infections/transmission , HIV-1/immunology , Histocompatibility Antigens Class I/immunology , Immune Evasion/immunology , AIDS Vaccines/immunology , Africa, Southern , British Columbia , CD4 Lymphocyte Count , Cohort Studies , Evolution, Molecular , HIV Infections/immunology , HIV-1/genetics , Humans , Immune Evasion/genetics , Immunity, Cellular/immunology , Linear Models , Models, Immunological , Proportional Hazards Models , Receptors, Antigen, T-Cell/immunology , Viral Load , Virus Replication/genetics
18.
BMC Bioinformatics ; 16: 164, 2015 May 17.
Article in English | MEDLINE | ID: mdl-25980407

ABSTRACT

BACKGROUND: Host-microbe and microbe-microbe interactions are often governed by the complex exchange of metabolites. Such interactions play a key role in determining the way pathogenic and commensal species impact their host and in the assembly of complex microbial communities. Recently, several studies have demonstrated how such interactions are reflected in the organization of the metabolic networks of the interacting species, and introduced various graph theory-based methods to predict host-microbe and microbe-microbe interactions directly from network topology. Using these methods, such studies have revealed evolutionary and ecological processes that shape species interactions and community assembly, highlighting the potential of this reverse-ecology research paradigm. RESULTS: NetCooperate is a web-based tool and a software package for determining host-microbe and microbe-microbe cooperative potential. It specifically calculates two previously developed and validated metrics for species interaction: the Biosynthetic Support Score which quantifies the ability of a host species to supply the nutritional requirements of a parasitic or a commensal species, and the Metabolic Complementarity Index which quantifies the complementarity of a pair of microbial organisms' niches. NetCooperate takes as input a pair of metabolic networks, and returns the pairwise metrics as well as a list of potential syntrophic metabolic compounds. CONCLUSIONS: The Biosynthetic Support Score and Metabolic Complementarity Index provide insight into host-microbe and microbe-microbe metabolic interactions. NetCooperate determines these interaction indices from metabolic network topology, and can be used for small- or large-scale analyses. NetCooperate is provided as both a web-based tool and an open-source Python module; both are freely available online at http://elbo.gs.washington.edu/software_netcooperate.html.


Subject(s)
Bacteria/metabolism , Computational Biology/methods , Host-Parasite Interactions , Metabolic Networks and Pathways , Microbial Interactions , Software , Animals , Bacteria/classification , Bacteria/genetics , Humans , Internet , Models, Biological
19.
PLoS Genet ; 10(9): e1004587, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25210734

ABSTRACT

Associations between the level of single transcripts and single corresponding genetic variants, expression single nucleotide polymorphisms (eSNPs), have been extensively studied and reported. However, most expression traits are complex, involving the cooperative action of multiple SNPs at different loci affecting multiple genes. Finding these cooperating eSNPs by exhaustive search has proven to be statistically challenging. In this paper we used the availability of sequencing data with transcriptional profiles in the same cohorts to identify two kinds of usual suspects: eSNPs that alter coding sequences or eSNPs within the span of transcription factors (TFs). We utilize a computational framework for considering triplets, each composed of a SNP and two associated genes. We examine pairs of triplets with such cooperating source eSNPs that are both associated with the same pair of target genes. We characterize such quartets through their genomic, topological and functional properties. We establish that this regulatory structure of cooperating quartets is frequent in real data, but is rarely observed in permutations. eSNP sources are mostly located on different chromosomes and away from their targets. In the majority of quartets, SNPs affect the expression of the two gene targets independently of one another, suggesting a mutually independent rather than a directionally dependent effect. Furthermore, the directions in which the minor allele count of the SNP affects gene expression within quartets are consistent, so that the two source eSNPs either both have the same effect on the target genes or both affect one gene in the opposite direction to the other. Same-effect eSNPs are observed more often than expected by chance. Cooperating quartets reported here in a human system might correspond to bi-fans, a known network motif of four nodes previously described in model organisms. Overall, our analysis offers insights regarding the fine motif structure of human regulatory networks.
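The quartet structure described in this abstract, two source eSNPs that are both associated with the same pair of target genes, has the topology of a bi-fan and can be enumerated with a short sketch. Only the graph search is shown; the association testing and permutation analysis from the paper are not reproduced, and the function name `find_quartets`, the SNP identifiers, and the gene labels are hypothetical.

```python
from itertools import combinations


def find_quartets(associations):
    """List (snp_a, snp_b, gene_x, gene_y) where both SNPs hit both genes."""
    targets = {}
    for snp, gene in associations:
        targets.setdefault(snp, set()).add(gene)
    quartets = []
    for snp_a, snp_b in combinations(sorted(targets), 2):
        shared = sorted(targets[snp_a] & targets[snp_b])
        # Every pair of shared targets closes one bi-fan with these sources.
        for gene_x, gene_y in combinations(shared, 2):
            quartets.append((snp_a, snp_b, gene_x, gene_y))
    return quartets


# Hypothetical eSNP -> target gene associations.
toy_associations = [
    ("rs1", "G1"), ("rs1", "G2"),
    ("rs2", "G1"), ("rs2", "G2"),
    ("rs3", "G2"), ("rs3", "G3"),
]
quartets = find_quartets(toy_associations)
```

Here rs1 and rs2 share both G1 and G2 and so form one quartet, while rs3 shares only a single target with each of them and contributes none.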


Subject(s)
Gene Regulatory Networks/genetics , Polymorphism, Single Nucleotide/genetics , Protein Structure, Tertiary/genetics , Alleles , Gene Expression/genetics , Gene Expression Profiling/methods , Humans , Transcription Factors/genetics , Transcription, Genetic/genetics
20.
Genome Biol ; 14(7): R71, 2013 Jul 11.
Article in English | MEDLINE | ID: mdl-23844908

ABSTRACT

BACKGROUND: In recent years many genetic variants (eSNPs) have been reported as associated with expression of transcripts in trans. However, the causal variants and regulatory mechanisms through which they act remain mostly unknown. In this paper we follow two kinds of usual suspects: SNPs that alter coding regions or transcription factors, identifiable by sequencing data with transcriptional profiles in the same cohort. We show these interpretable genomic regions are enriched for eSNP association signals, thereby naturally defining source-target gene pairs. We map these pairs onto a protein-protein interaction (PPI) network and study their topological properties. RESULTS: For exonic eSNP sources, we report source-target proximity and high target degree within the PPI network. These pairs are more likely to be co-expressed and the eSNPs tend to have a cis effect, modulating the expression of the source gene. In contrast, transcription factor source-target pairs are not observed to have such properties, but instead a transcription factor source tends to assemble into units of defined functional roles along with its gene targets, and to share with them the same functional cluster of the PPI network. CONCLUSIONS: Our results suggest two modes of trans regulation: transcription factor variation frequently acts via a modular regulation mechanism, with multiple targets that share a function with the transcription factor source. Notwithstanding, exon variation often acts by a local cis effect, delineating shorter paths of interacting proteins across functional clusters of the PPI network.


Subject(s)
Exons/genetics , Gene Expression Regulation , Polymorphism, Single Nucleotide/genetics , Transcription Factors/genetics , Cluster Analysis , Humans , Molecular Sequence Annotation , Protein Interaction Maps/genetics