ABSTRACT
Glioblastoma multiforme is a brain tumor distinguished by its aggressiveness. We suggested that this aggressiveness leads single-cell RNA sequencing (scRNA-seq) data to span a representative portion of the cancer attractor's domain. This conjecture allowed us to interpret the scRNA-seq heterogeneity as reflecting a representative trajectory within the attractor's domain. We considered factors such as genomic instability to characterize the cancer dynamics through stochastic fixed points. The fixed points were derived from centroids obtained through various clustering methods to verify our method's sensitivity. This methodological foundation is based upon the equivalence of sample and time averages, assigning an interpretative value to the data cluster centroids and supporting parameter estimation. We used stochastic simulations to reproduce the dynamics, and our results showed an alignment between experimental and simulated dataset centroids. We also computed the Waddington landscape, which provided a visual framework for validating the centroids and standard deviations as characterizations of cancer attractors. Additionally, we examined the stability and transitions between attractors and revealed a potential interplay between subtypes. These transitions might be related to cancer recurrence and progression, connecting the molecular mechanisms of cancer heterogeneity with statistical properties of gene expression dynamics. Our work advances the modeling of gene expression dynamics and paves the way for personalized therapeutic interventions.
Subjects
Brain Neoplasms , Glioblastoma , Single-Cell Analysis , Glioblastoma/genetics , Glioblastoma/pathology , Glioblastoma/metabolism , Humans , Single-Cell Analysis/methods , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Brain Neoplasms/metabolism , Gene Expression Regulation, Neoplastic , Genetic Heterogeneity , Gene Expression Profiling/methods , Genomic Instability , Sequence Analysis, RNA/methods , Cluster Analysis
ABSTRACT
Alternative polyadenylation (APA) increases transcript diversity through the generation of isoforms with varying 3' untranslated region (3' UTR) lengths. As the 3' UTR harbors regulatory element target sites, such as miRNAs or RNA-binding proteins, changes in this region can impact post-transcriptional regulation and translation. Moreover, the APA landscape can change based on the cell type, cell state, or condition. Given that APA events can impact protein expression, investigating translational control is crucial for comprehending the overall cellular regulation process. Revisiting data from polysome profiling followed by RNA sequencing, we investigated the cardiomyogenic differentiation of pluripotent stem cells by identifying the transcripts that show dynamic 3' UTR lengthening or shortening, which are being actively recruited to ribosome complexes. Our findings indicate that dynamic 3' UTR lengthening is not exclusively associated with differential expression during cardiomyogenesis but rather with recruitment to polysomes. We confirm that the differentiated state of cardiomyocytes shows a preference for shorter 3' UTRs in comparison to the pluripotent stage, although preferences vary during the days of the differentiation process. The most distinct regulatory changes are seen on day 4 of differentiation, which is the mesoderm commitment time point of cardiomyogenesis. After identifying the miRNAs that would specifically target the alternative 3' UTR region of the isoforms, we constructed a gene regulatory network for the cardiomyogenesis process, in which genes related to the cell cycle were identified. Altogether, our work sheds light on the regulation and dynamic 3' UTR changes of polysome-recruited transcripts that take place during the cardiomyogenic differentiation of pluripotent stem cells.
ABSTRACT
Studying gene regulatory networks associated with cancer provides valuable insights for therapeutic purposes, given that cancer is fundamentally a genetic disease. However, as the number of genes in the system increases, the complexity arising from the interconnections between network components grows exponentially. In this study, using Boolean logic to adjust the existing relationships between network components has facilitated simplifying the modeling process, enabling the generation of attractors that represent cell phenotypes based on breast cancer RNA-seq data. A key therapeutic objective is to guide cells, through targeted interventions, to transition from the current cancer attractor to a physiologically distinct attractor unrelated to cancer. To achieve this, we developed a computational method that identifies network nodes whose inhibition can facilitate the desired transition from one tumor attractor to another associated with apoptosis, leveraging transcriptomic data from cell lines. To validate the model, we utilized previously published in vitro experiments where the downregulation of specific proteins resulted in cell growth arrest and death of a breast cancer cell line. The method proposed in this manuscript combines diverse data sources, conducts structural network analysis, and incorporates relevant biological knowledge on apoptosis in cancer cells. This comprehensive approach aims to identify potential targets of significance for personalized medicine.
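The idea of finding nodes whose inhibition moves the system from a tumor attractor to an apoptosis-associated attractor can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-node network, its rules, the node names, and the synchronous update scheme are hypothetical and are not the network or the method used in the study.

```python
# Hypothetical three-node Boolean network (illustrative only, not from the study).
# Each rule maps the current state (a dict of booleans) to the node's next value.
RULES = {
    "GROWTH": lambda s: s["GROWTH"] and not s["DEATH"],
    "SURVIVAL": lambda s: s["GROWTH"],
    "DEATH": lambda s: not s["SURVIVAL"],
}

def step(state, knocked_out=frozenset()):
    """Synchronous update; knocked-out (inhibited) nodes are forced to False."""
    nxt = {node: bool(rule(state)) for node, rule in RULES.items()}
    for node in knocked_out:
        nxt[node] = False
    return nxt

def attractor_from(state, knocked_out=frozenset()):
    """Iterate from `state` until a state repeats; return the repeating cycle."""
    state = {node: bool(state[node]) and node not in knocked_out for node in RULES}
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, knocked_out)
    return seen[seen.index(state):]

TUMOR = {"GROWTH": True, "SURVIVAL": True, "DEATH": False}
APOPTOSIS = {"GROWTH": False, "SURVIVAL": False, "DEATH": True}

def nodes_driving_transition(source, target):
    """Single-node inhibitions that take `source` to the fixed point `target`."""
    return [n for n in RULES if attractor_from(source, frozenset({n})) == [target]]

print(nodes_driving_transition(TUMOR, APOPTOSIS))
```

In this toy network the tumor-like state is a fixed point when no node is inhibited, and inhibiting either of two nodes drives it to the apoptosis-like fixed point. A real analysis would need asynchronous updates or larger state spaces to be considered, which this sketch ignores.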
Subjects
Breast Neoplasms , Models, Genetic , Humans , Female , Breast Neoplasms/genetics , Algorithms , Gene Regulatory Networks , MCF-7 Cells , Models, Biological
ABSTRACT
Sulfur (S) is an essential macronutrient for plants and its availability in soils is an important determinant for growth and development. Current regulatory policies aimed at reducing industrial S emissions together with changes in agronomical practices have led to a decline in S contents in soils worldwide. Deficiency of sulfate, the primary form of S accessible to plants in soil, has adverse effects on both crop yield and nutritional quality. Hence, recent research has increasingly focused on unraveling the molecular mechanisms through which plants detect and adapt to a limiting supply of sulfate. A significant part of these studies involves the use of omics technologies and has generated comprehensive catalogs of sulfate deficiency-responsive genes and processes, principally in Arabidopsis together with a few studies centering on crop species such as wheat, rice, or members of the Brassica genus. Although we know that sulfate deficiency elicits an important reprogramming of the transcriptome, the transcriptional regulators orchestrating this response are not yet well understood. In this review, we summarize our current knowledge of gene expression responses to sulfate deficiency and recent efforts towards the identification of the transcription factors that are involved in controlling these responses. We further compare the transcriptional response and putative regulators between Arabidopsis and two important crop species, rice and tomato, to gain insights into common mechanisms of the response to sulfate deficiency.
Subjects
Gene Expression Regulation, Plant , Gene Regulatory Networks , Sulfates , Sulfates/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis/growth & development , Arabidopsis/physiology , Oryza/genetics , Oryza/metabolism , Oryza/growth & development
ABSTRACT
Introduction: Pseudomonas aeruginosa infections are one of the leading causes of death in immunocompromised patients with cystic fibrosis, diabetes, and lung diseases such as pneumonia and bronchiectasis. Furthermore, P. aeruginosa is one of the main multidrug-resistant bacteria responsible for nosocomial infections worldwide, including the multidrug-resistant CCBH4851 strain isolated in Brazil. Methods: One way to analyze their dynamic cellular behavior is through computational modeling of the gene regulatory network, which represents interactions between regulatory genes and their targets. For this purpose, Boolean models are important predictive tools to analyze these interactions. They are one of the most commonly used methods for studying complex dynamic behavior in biological systems. Results and discussion: Therefore, this research consists of building a Boolean model of the gene regulatory network of P. aeruginosa CCBH4851 using data from RNA-seq experiments. Next, the basins of attraction are estimated, as these regions and the transitions between them can help identify the attractors, representing long-term behavior in the Boolean model. The essential genes of the basins were associated with the phenotypes of the bacteria for two conditions: biofilm formation and polymyxin B treatment. Overall, the Boolean model and the analysis method proposed in this work can identify promising control actions and indicate potential therapeutic targets, which can help pinpoint new drugs and intervention strategies.
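To make the notion of basins of attraction concrete, the following minimal sketch enumerates every state of a small Boolean model, follows each one to its attractor, and counts how many states drain into each attractor. The three update rules are hypothetical and chosen only for illustration; they are unrelated to the P. aeruginosa CCBH4851 network, and exhaustive enumeration is only feasible for small models.

```python
from collections import Counter
from itertools import product

def update(state):
    """Hypothetical synchronous rules for a three-node model (a, b, c)."""
    a, b, c = state
    return (b, a, a and b)

def attractor_of(state):
    """Follow the trajectory until a state repeats; return the cycle as a set."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = update(state)
    return frozenset(seen[seen.index(state):])

def basin_sizes(n_nodes=3):
    """Map each attractor to the number of initial states that reach it."""
    return Counter(attractor_of(s) for s in product((0, 1), repeat=n_nodes))

for attractor, size in basin_sizes().items():
    print(sorted(attractor), size)
```

This toy model has two fixed points and one two-state cycle, so it also shows that an attractor need not be a single state.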
ABSTRACT
Gene regulatory networks are graph models representing cellular transcription events. Networks are far from complete due to time and resource consumption for experimental validation and curation of the interactions. Previous assessments have shown the modest performance of the available network inference methods based on gene expression data. Here, we study several caveats on the inference of regulatory networks and methods assessment through the quality of the input data and gold standard, and the assessment approach with a focus on the global structure of the network. We used synthetic and biological data for the predictions and experimentally-validated biological networks as the gold standard (ground truth). Standard performance metrics and graph structural properties suggest that methods inferring co-expression networks should no longer be assessed equally with those inferring regulatory interactions. While methods inferring regulatory interactions perform better in global regulatory network inference than co-expression-based methods, the latter is better suited to infer function-specific regulons and co-regulation networks. When merging expression data, the size increase should outweigh the noise inclusion and graph structure should be considered when integrating the inferences. We conclude with guidelines to take advantage of inference methods and their assessment based on the applications and available expression datasets.
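One reason co-expression and regulatory inferences may deserve separate assessment is that a co-expression edge carries no direction, whereas a gold-standard regulatory edge does. The sketch below is an illustration under assumptions, not the evaluation protocol of the study: it computes precision and recall of predicted edges against a gold standard, either respecting edge direction or ignoring it. The gene names and edge lists are made up.

```python
def precision_recall(predicted, gold, directed=True):
    """Precision and recall of predicted edges against gold-standard edges.

    With directed=False, (a, b) and (b, a) count as the same edge, which is
    the natural reading for co-expression links.
    """
    def norm(edge):
        return tuple(edge) if directed else tuple(sorted(edge))

    pred = {norm(e) for e in predicted}
    true = {norm(e) for e in gold}
    hits = len(pred & true)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(true) if true else 0.0
    return precision, recall

# Hypothetical gold standard (regulator, target) and predictions.
gold = [("tfA", "g1"), ("tfA", "g2"), ("tfB", "g3")]
predicted = [("tfA", "g1"), ("g2", "tfA"), ("tfB", "g4")]

print(precision_recall(predicted, gold, directed=True))
print(precision_recall(predicted, gold, directed=False))
```

The same prediction list scores higher when direction is ignored, which shows how the choice of scoring convention alone can change a method's apparent performance.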
ABSTRACT
Context: Inferring gene regulatory networks (GRN) from high-throughput gene expression data is a challenging task for which different strategies have been developed. Nevertheless, no ever-winning method exists, and each method has its advantages, intrinsic biases, and application domains. Thus, in order to analyze a dataset, users should be able to test different techniques and choose the most appropriate one. This step can be particularly difficult and time consuming, since most methods' implementations are made available independently, possibly in different programming languages. The implementation of an open-source library containing different inference methods within a common framework is expected to be a valuable toolkit for the systems biology community. Results: In this work, we introduce GReNaDIne (Gene Regulatory Network Data-driven Inference), a Python package that implements 18 machine learning data-driven gene regulatory network inference methods. It also includes eight generalist preprocessing techniques, suitable for both RNA-seq and microarray dataset analysis, as well as four normalization techniques dedicated to RNA-seq. In addition, this package implements the possibility to combine the results of different inference tools to form robust and efficient ensembles. This package has been successfully assessed under the DREAM5 challenge benchmark dataset. The open-source GReNaDIne Python package is made freely available in a dedicated GitLab repository, as well as in the official third-party software repository PyPI Python Package Index. The latest documentation on the GReNaDIne library is also available at Read the Docs, an open-source software documentation hosting platform. Contribution: The GReNaDIne tool represents a technological contribution to the field of systems biology. This package can be used to infer gene regulatory networks from high-throughput gene expression data using different algorithms within the same framework. 
In order to analyze their datasets, users can apply a battery of preprocessing and postprocessing tools and choose the most adapted inference method from the GReNaDIne library and even combine the output of different methods to obtain more robust results. The results format provided by GReNaDIne is compatible with well-known complementary refinement tools such as PYSCENIC.
Subjects
Computational Biology , Gene Regulatory Networks , Computational Biology/methods , Saint Vincent and the Grenadines , Software , Gene Expression
ABSTRACT
The transcriptomic analysis of microarray and RNA-Seq datasets followed our own bioinformatic pipeline to identify a transcriptional regulatory network of lung cancer. Twenty-six transcription factors are dysregulated and co-expressed in most of the lung cancer and pulmonary arterial hypertension datasets, which makes them the most frequently dysregulated transcription factors. Co-expression, gene regulatory, coregulatory, and transcriptional regulatory networks, along with fibration symmetries, were constructed to identify common connection patterns, alignments, main regulators, and target genes in order to analyze transcription factor complex formation, as well as its synchronized co-expression patterns in every type of lung cancer. The regulatory function of the most frequently dysregulated transcription factors over lung cancer deregulated genes was validated with ChEA3 enrichment analysis. A Kaplan-Meier plotter analysis linked the dysregulation of the top transcription factors with lung cancer patients' survival. Our results indicate that lung cancer has unique and common deregulated genes and transcription factors with pulmonary arterial hypertension, co-expressed and regulated in a coordinated and cooperative manner by the transcriptional regulatory network that might be associated with critical biological processes and signaling pathways related to the acquisition of the hallmarks of cancer, making them potentially relevant tumor biomarkers for lung cancer early diagnosis and targets for the development of personalized therapies against lung cancer.
ABSTRACT
Piscirickettsia salmonis is the most important health problem facing Chilean Aquaculture. Previous reports suggest that P. salmonis can survive in salmonid macrophages by interfering with the host immune response. However, the relevant aspects of the molecular pathogenesis of P. salmonis have been poorly characterized. In this work, we evaluated the transcriptomic changes in macrophage-like cell line SHK-1 infected with P. salmonis at 24- and 48-hours post-infection (hpi) and generated network models of the macrophage response to the infection using co-expression analysis and regulatory transcription factor-target gene information. Transcriptomic analysis showed that 635 genes were differentially expressed after 24- and/or 48-hpi. The pattern of expression of these genes was analyzed by weighted co-expression network analysis (WGCNA), which classified genes into 4 modules of expression, comprising early responses to the bacterium. Induced genes included genes involved in metabolism and cell differentiation, intracellular transportation, and cytoskeleton reorganization, while repressed genes included genes involved in extracellular matrix organization and RNA metabolism. To understand how these expression changes are orchestrated and to pinpoint relevant transcription factors (TFs) controlling the response, we established a curated database of TF-target gene regulatory interactions in Salmo salar, SalSaDB. Using this resource, together with co-expression module data, we generated infection context-specific networks that were analyzed to determine highly connected TF nodes. We found that the most connected TF of the 24- and 48-hpi response networks is KLF17, an ortholog of the KLF4 TF involved in the polarization of macrophages to an M2-phenotype in mammals. Interestingly, while KLF17 is induced by P. salmonis infection, other TFs, such as NOTCH3 and NFATC1, whose orthologs in mammals are related to M1-like macrophages, are repressed. 
In sum, our results suggest the induction of early regulatory events associated with an M2-like phenotype of macrophages that drives effectors related to the lysosome, RNA metabolism, cytoskeleton organization, and extracellular matrix remodeling. Moreover, the M1-like response seems delayed in generating an effective response, suggesting a polarization towards M2-like macrophages that allows the survival of P. salmonis. This work also contributes to SalSaDB, a curated database of TF-target gene interactions that is freely available for the Atlantic salmon community.
Subjects
Salmo salar , Animals , Salmo salar/genetics , Gene Expression Profiling , Macrophages/metabolism , Transcription Factors/metabolism , RNA/metabolism , Mammals
ABSTRACT
Cyclic attractors generated from Boolean models may explain the adaptability of a cell in response to a dynamical complex tumor microenvironment. In contrast to this idea, we postulate that cyclic attractors in certain cases could be a systemic mechanism to face the perturbations coming from the environment. To justify our conjecture, we present a dynamic analysis of a highly curated transcriptional regulatory network of macrophages constrained into a cancer microenvironment. We observed that when M1-associated transcription factors (STAT1 or NF-κB) are perturbed and the microenvironment balances to a hyper-inflammation condition, cycle attractors activate genes whose signals counteract this effect implicated in tissue damage. The same behavior happens when the M2-associated transcription factors are disturbed (STAT3 or STAT6); cycle attractors will prevent a hyper-regulation scenario implicated in providing a suitable environment for tumor growth. Therefore, here we propose that cyclic macrophage phenotypes can serve as a reservoir for balancing the phenotypes when a specific phenotype-based transcription factor is perturbed in the regulatory network of macrophages. We consider that cyclic attractors should not be simply ignored, but it is necessary to carefully evaluate their biological importance. In this work, we suggest one conjecture: the cyclic attractors can serve as a reservoir to balance the inflammatory/regulatory response of the network under external perturbations.
Subjects
Algorithms , Tumor Microenvironment , Gene Regulatory Networks , Macrophages , Transcription Factors/genetics
ABSTRACT
The use of a new bioinformatics pipeline allowed the identification of deregulated transcription factors (TFs) coexpressed in lung cancer that could become biomarkers of tumor establishment and progression. A gene regulatory network (GRN) of lung cancer was created with the normalized gene expression levels of differentially expressed genes (DEGs) from the microarray dataset GSE19804. Moreover, coregulatory and transcriptional regulatory network (TRN) analyses were performed for the main regulators identified in the GRN analysis. The gene targets and binding motifs of all potentially implicated regulators were identified in the TRN and with multiple alignments of the TFs' target gene sequences. Six transcription factors (E2F3, FHL2, ETS1, KAT6B, TWIST1, and RUNX2) were identified in the GRN as essential regulators of gene expression in non-small-cell lung cancer (NSCLC) and related to the lung tumoral process. Our findings indicate that RUNX2 could be an important regulator of the lung cancer GRN through the formation of coregulatory complexes with other TFs related to the establishment and progression of lung cancer. Therefore, RUNX2 could become an essential biomarker for developing diagnostic tools and specific treatments against tumoral diseases in the lung after the experimental validation of its regulatory function.
ABSTRACT
BACKGROUND: Research on gene duplication is abundant and comes from a wide range of approaches, from high-throughput analyses and experimental evolution to bioinformatics and theoretical models. Notwithstanding, a consensus is still lacking regarding evolutionary mechanisms involved in evolution through gene duplication as well as the conditions that affect them. We argue that a better understanding of evolution through gene duplication requires considering explicitly that genes do not act in isolation. It demands studying how the perturbation that gene duplication implies percolates through the web of gene interactions. Due to evolution's contingent nature, the paths that lead to the final fate of duplicates must depend strongly on the early stages of gene duplication, before gene copies have accumulated distinctive changes. METHODS: Here we use a widely-known model of gene regulatory networks to study how gene duplication affects network behavior in early stages. Such networks comprise sets of genes that cross-regulate. They organize gene activity creating the gene expression patterns that give cells their phenotypic properties. We focus on how duplication affects two evolutionarily relevant properties of gene regulatory networks: mitigation of the effect of new mutations and access to new phenotypic variants through mutation. RESULTS: Among other observations, we find that those networks that are better at maintaining the original phenotype after duplication are usually also better at buffering the effect of single interaction mutations and that duplication tends to enhance further this ability. Moreover, the effect of mutations after duplication depends on both the kind of mutation and genes involved in it. We also found that those phenotypes that had easier access through mutation before duplication had higher chances of remaining accessible through new mutations after duplication. 
CONCLUSION: Our results support that gene duplication often mitigates the impact of new mutations and that this effect is not merely due to changes in the number of genes. The work that we put forward helps to identify conditions under which gene duplication may enhance evolvability and robustness to mutations.
Subjects
Gene Duplication , Gene Regulatory Networks , Mutation , Phenotype , Biological Variation, Population
Subjects
COVID-19 , SARS-CoV-2 , Humans , Gene Regulatory Networks , Models, Genetic , Macrophages
ABSTRACT
Post-embryonic plant development is characterized by a period of vegetative growth during which a combination of intrinsic and extrinsic signals triggers the transition to the reproductive phase. To understand how different flowering inducing and repressing signals are associated with phase transitions of the Shoot Apical Meristem (SAM), we incorporated available data into a dynamic gene regulatory network model for Arabidopsis thaliana. This Flowering Transition Gene Regulatory Network (FT-GRN) formally constitutes a dynamic system-level mechanism based on more than three decades of experimental data on flowering. We provide novel experimental data on the regulatory interactions of one of its twenty-three components: a MADS-box transcription factor XAANTAL2 (XAL2). These data complement the information regarding flowering transition under short days and provide an example of the type of questions that can be addressed by the FT-GRN. The resulting FT-GRN is highly connected and integrates developmental, hormonal, and environmental signals that affect developmental transitions at the SAM. The FT-GRN is a dynamic multi-stable Boolean system, with 2^23 possible initial states, yet it converges into only 32 attractors. The latter are coherent with the expression profiles of the FT-GRN components that have been experimentally described for the developmental stages of the SAM. Furthermore, the attractors are also highly robust to initial states and to simulated perturbations of the interaction functions. The model recovered the meristem phenotypes of previously described single mutants. We also analyzed the attractor landscape that emerges from the postulated FT-GRN, uncovering which set of signals or components are critical for reproductive competence and the time-order transitions observed in the SAM. Finally, in the context of such GRN, the role of XAL2 under short-day conditions could be understood. 
Therefore, this model constitutes a robust biological module and the first multi-stable, dynamical systems biology mechanism that integrates the genetic flowering pathways to explain SAM phase transitions.
ABSTRACT
Leishmania amazonensis and Leishmania major are the causative agents of cutaneous and mucocutaneous diseases. The infections' outcome depends on host-parasite interactions and Th1/Th2 response, and in cutaneous form, regulation of Th17 cytokines has been reported to maintain inflammation in lesions. Despite that, the Th17 regulatory scenario remains unclear. To gain a better understanding of the transcription factors (TFs) and genes involved in Th17 induction, this study addressed the role of inducing factors of the Th17 pathway in Leishmania-macrophage infection through computational modeling of gene regulatory networks (GRNs). The Th17 GRN modeling integrated experimentally validated data available in the literature and gene expression data from a time-series RNA-seq experiment (4, 24, 48, and 72 h post-infection). The generated model comprises a total of 10 TFs, 22 coding genes, and 16 cytokines related to the Th17 immune modulation. Addressing the Th17 induction in infected and uninfected macrophages, an increase of 2- to 3-fold in 4-24 h was observed in the former. However, there was a decrease in basal levels at 48-72 h for both groups. In order to evaluate the possible outcomes triggered by GRN component modulation in the Th17 pathway, the generated GRN models promoted an integrative and dynamic view of Leishmania-macrophage interaction over time that extends beyond the analysis of single-gene expression.
Subjects
Leishmania major , Leishmania mexicana , Leishmaniasis , Cytokines/metabolism , Gene Regulatory Networks , Humans , Leishmania mexicana/genetics , Leishmania mexicana/metabolism , Macrophages
ABSTRACT
Introduction: Staphylococcus aureus is one of the most prevalent and relevant pathogens responsible for a wide spectrum of hospital-associated or community-acquired infections. In addition, methicillin-resistant Staphylococcus aureus may display multidrug resistance profiles that complicate treatment and increase the mortality rate. The ability to produce biofilm, particularly in device-associated infections, promotes chronic and potentially more severe infections originating from the primary site. Understanding the complex mechanisms involved in planktonic and biofilm growth is critical to identifying regulatory connections and ways to overcome the global health problem of multidrug-resistant bacteria. Methods: In this work, we apply literature-based and comparative genomics approaches to reconstruct the gene regulatory network of the high biofilm-producing strain Bmb9393, belonging to one of the highly disseminating successful clones, the Brazilian epidemic clone. To the best of our knowledge, we describe for the first time the topological properties and network motifs for the Staphylococcus aureus pathogen. We performed this analysis using the ST239-SCCmecIII Bmb9393 strain. In addition, we analyzed transcriptomes available in the literature to construct a set of genes differentially expressed in the biofilm, covering different stages of the biofilms and genetic backgrounds of the strains. Results and discussion: The Bmb9393 gene regulatory network comprises 1,803 regulatory interactions between 64 transcription factors and the non-redundant set of 1,151 target genes with the inclusion of 19 new regulons compared to the N315 transcriptional regulatory network published in 2011. In the Bmb9393 network, we found 54 feed-forward loop motifs, where the most prevalent were coherent type 2 and incoherent type 2. 
The non-redundant set of differentially expressed genes in the biofilm consisted of 1,794 genes with functional categories relevant for adaptation to the variable microenvironments established throughout the biofilm formation process. Finally, we mapped the set of genes with altered expression in the biofilm in the Bmb9393 gene regulatory network to depict how different growth modes can alter the regulatory systems. The data revealed 45 transcription factors and 876 shared target genes. Thus, the gene regulatory network model provided represents the most up-to-date model for Staphylococcus aureus, and the set of genes altered in the biofilm provides a global view of their influence on biofilm formation from distinct experimental perspectives and different strain backgrounds.
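For readers unfamiliar with feed-forward loop motifs, the sketch below shows one way such motifs could be enumerated in a signed regulatory network. It is an illustration under assumptions, not the procedure used for the Bmb9393 network: edges are given as a hypothetical dictionary mapping (regulator, target) to +1 (activation) or -1 (repression), and a loop is labeled coherent when the sign of the direct path X→Z equals the product of the signs along the indirect path X→Y→Z. Subtype numbering (type 1, type 2, and so on) is deliberately left out.

```python
def feed_forward_loops(edges):
    """Enumerate feed-forward loops X->Y, Y->Z, X->Z in a signed network.

    `edges` maps (regulator, target) to +1 or -1. Returns tuples
    (x, y, z, kind) with kind "coherent" or "incoherent".
    """
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)

    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:  # skip autoregulation
                continue
            for z in targets.get(y, ()):
                if z in (x, y) or z not in x_targets:
                    continue
                indirect = edges[(x, y)] * edges[(y, z)]
                kind = "coherent" if indirect == edges[(x, z)] else "incoherent"
                loops.append((x, y, z, kind))
    return loops

# Hypothetical signed edges: +1 activation, -1 repression.
edges = {
    ("X", "Y"): 1, ("Y", "Z"): 1, ("X", "Z"): 1,
    ("A", "B"): -1, ("B", "C"): 1, ("A", "C"): 1,
}
print(sorted(feed_forward_loops(edges)))
```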
ABSTRACT
BACKGROUND: Healthcare-associated infections due to multidrug-resistant (MDR) bacteria such as Pseudomonas aeruginosa are significant public health issues worldwide. A systems biology approach can help understand bacterial behaviour and provide novel ways to identify potential therapeutic targets and develop new drugs. Gene regulatory networks (GRN) are examples of in silico representation of interaction between regulatory genes and their targets. OBJECTIVES: In this work, we update the MDR P. aeruginosa CCBH4851 GRN reconstruction and analyse and discuss its structural properties. METHODS: We based this study on the gene orthology inference methodology using the reciprocal best hit method. The P. aeruginosa CCBH4851 genome and GRN, published in 2019, and the P. aeruginosa PAO1 GRN, published in 2020, were used for this update reconstruction process. FINDINGS: Our result is a GRN with a greater number of regulatory genes, target genes, and interactions compared to the previous networks, and its structural properties are consistent with the complexity of biological networks and the biological features of P. aeruginosa. MAIN CONCLUSIONS: Here, we present the largest and most complete version of P. aeruginosa GRN published to this date, to the best of our knowledge.
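The reciprocal best hit criterion mentioned in the methods can be sketched as follows. This is a minimal illustration under assumptions: similarity hits are taken as precomputed (query, subject, score) tuples from some alignment tool, a higher score is assumed to be better, ties are resolved by keeping the first hit seen, and the gene identifiers are made up. It does not reproduce the thresholds or software used in the study.

```python
def best_hits(hits):
    """Keep the highest-scoring subject for each query."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Hypothetical hits between a reference genome ("ref_*") and a new strain ("new_*").
a_vs_b = [("ref_1", "new_1", 900), ("ref_1", "new_2", 300),
          ("ref_2", "new_2", 500), ("ref_3", "new_2", 450)]
b_vs_a = [("new_1", "ref_1", 880), ("new_2", "ref_2", 510),
          ("new_2", "ref_3", 400), ("new_3", "ref_3", 200)]

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Regulatory interactions known for the reference strain could then be transferred only between gene pairs that pass this test, which is the usual motivation for using reciprocal best hits as an orthology proxy.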
ABSTRACT
Botrytis cinerea and Trichoderma atroviride are two relevant fungi in agricultural systems. To gain insights into these organisms' transcriptional gene regulatory networks (GRNs), we generated a manually curated transcription factor (TF) dataset for each of them, followed by a GRN inference utilizing available sequence motifs describing DNA-binding specificity and global gene expression data. As a proof of concept of the usefulness of this resource to pinpoint key transcriptional regulators, we employed publicly available transcriptomics data and a newly generated dual RNA-seq dataset to build context-specific Botrytis and Trichoderma GRNs under two different biological paradigms: exposure to continuous light and Botrytis-Trichoderma confrontation assays. Network analysis of fungal responses to constant light revealed striking differences in the transcriptional landscape of both fungi. On the other hand, we found that the confrontation of both microorganisms elicited a distinct set of differentially expressed genes with changes in T. atroviride exceeding those in B. cinerea. Using our regulatory network data, we were able to determine, in both fungi, central TFs involved in this interaction response, including TFs controlling a large set of extracellular peptidases in the biocontrol agent T. atroviride. In summary, our work provides a comprehensive catalog of transcription factors and regulatory interactions for both organisms. This catalog can now serve as a basis for generating novel hypotheses on transcriptional regulatory circuits in different experimental contexts.
ABSTRACT
Cancer is a genomic disease involving various intertwined pathways with complex cross-communication links. Conceptually, this complex interconnected system forms a network, which allows one to model the dynamic behavior of the elements that characterize it to describe the entire system's development in its various evolutionary stages of carcinogenesis. Knowing the activation or inhibition status of the genes that make up the network during its temporal evolution is necessary for the rational intervention on the critical factors for controlling the system's dynamic evolution. In this report, we proposed a methodology for building data-driven Boolean networks that model breast cancer tumors. We defined the network components and topology based on gene expression data from RNA-seq of breast cancer cell lines. We used a Boolean logic formalism to describe the network dynamics. The combination of single-cell RNA-seq and interactome data enabled us to study the dynamics of malignant subnetworks of up-regulated genes. First, we used the same Boolean function construction scheme for each network node, based on canalyzing functions. Using single-cell breast cancer datasets from The Cancer Genome Atlas, we applied a binarization algorithm. The binarized version of scRNA-seq data allowed identifying attractors specific to patients and critical genes related to each breast cancer subtype. The model proposed in this report may serve as a basis for a methodology to detect critical genes involved in malignant attractor stability, whose inhibition could have potential applications in cancer theranostics.
ABSTRACT
Gene Regulatory Networks (GRNs) allow the study of regulation of gene expression of whole genomes. Among the most relevant advantages of using networks to depict this key process are the visual representation of large amounts of information and the application of graph theory to generate new knowledge. Nonetheless, despite the many uses of GRNs, it is still difficult and expensive to assign Transcription Factors (TFs) to the regulation of specific genes. ChIP-Seq allows the determination of TF Binding Sites (TFBSs) over whole genomes, but it is still an expensive technique that can only be applied one TF at a time and requires replicates to reduce its noise. Once TFBSs are determined, the assignment of each TF and its binding sites to the regulation of specific genes is not trivial, and it is often performed by carrying out site-specific experiments that are unfeasible to perform in all possible binding sites. Here, we addressed these relevant issues with a two-step methodology using Drosophila melanogaster as a case study. First, our protocol starts by gathering all TFBSs determined with ChIP-Seq experiments available at ENCODE and FlyBase. Then each TFBS is used to assign TFs to the regulation of likely target genes based on the TFBS proximity to the transcription start site of all genes. In the final step, to try to select the most likely regulatory TF from those previously assigned to each gene, we employ GENIE3, a random forest-based method, and more than 9,000 RNA-seq experiments from D. melanogaster. Next, we employed known TF protein-protein interactions to estimate the feasibility of regulatory events in our filtered networks. Finally, we show how known interactions between co-regulatory TFs of each gene increase after the second step of our approach, and thus, the consistency of the TF-gene assignment. 
Also, we employed our methodology to create a network centered on the Drosophila melanogaster gene Hr96 to demonstrate the role of this transcription factor on mitochondrial gene regulation.
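The first step described above, assigning TFs to likely target genes by the proximity of their binding sites to transcription start sites (TSSs), can be sketched as below. This is an assumption-laden illustration: the abstract does not state the distance cutoff or how strand and gene models are handled, so the `max_distance` window, the data layout, and all identifiers here are hypothetical.

```python
def assign_tfs_to_genes(binding_sites, tss_positions, max_distance=1000):
    """Assign each TF to genes whose TSS lies within `max_distance` bp of a site.

    binding_sites: iterable of (tf, chromosome, position)
    tss_positions: dict gene -> (chromosome, tss_position)
    max_distance:  hypothetical window; the source gives no cutoff
    Returns a dict gene -> set of candidate regulator TFs.
    """
    assignments = {}
    for tf, chrom, pos in binding_sites:
        for gene, (gene_chrom, tss) in tss_positions.items():
            if gene_chrom == chrom and abs(pos - tss) <= max_distance:
                assignments.setdefault(gene, set()).add(tf)
    return assignments

# Made-up binding sites and TSS coordinates.
sites = [("tfA", "2L", 1500), ("tfB", "2L", 5200), ("tfA", "3R", 1400)]
tss = {"gene1": ("2L", 1000), "gene2": ("2L", 5000), "gene3": ("3R", 9000)}

print(assign_tfs_to_genes(sites, tss))
```

The candidate sets produced this way would then be the input to the expression-based filtering step, which this sketch does not attempt to reproduce. The nested loop is quadratic and only meant for clarity; genome-scale data would call for an interval index.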