Results 1 - 20 of 32
1.
J Biosci ; 49, 2024.
Article in English | MEDLINE | ID: mdl-39119913

ABSTRACT

Single-cell RNA sequencing (scRNA-Seq) technology provides the scope to gain insight into the interplay between intrinsic cellular processes as well as transcriptional and behavioral changes in gene-gene interactions across varying conditions. The high level of sparsity of scRNA-seq data, however, poses a significant challenge for analysis. We propose a complete differential co-expression (DCE) analysis framework for scRNA-Seq data to extract network modules and identify hub-genes. The performance of our method has been shown to be satisfactory after validation using an scRNA-Seq esophageal squamous cell carcinoma (ESCC) dataset. From comparison with four other existing hub-gene finding methods, it has been observed that our method performs better in the majority of cases and has the ability to identify unique potential biomarkers that were not detected by the other methods. The potential biomarker genes identified by our framework, differential co-expression analysis method for single-cell RNA sequencing data (scDiffCoAM), have been validated both statistically and biologically.


Subject(s)
Biomarkers, Tumor , Esophageal Neoplasms , Esophageal Squamous Cell Carcinoma , Gene Expression Regulation, Neoplastic , Sequence Analysis, RNA , Single-Cell Analysis , Humans , Esophageal Squamous Cell Carcinoma/genetics , Esophageal Squamous Cell Carcinoma/pathology , Single-Cell Analysis/methods , Biomarkers, Tumor/genetics , Esophageal Neoplasms/genetics , Esophageal Neoplasms/pathology , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , RNA-Seq/methods , Single-Cell Gene Expression Analysis
3.
Comput Struct Biotechnol J ; 21: 812-836, 2023.
Article in English | MEDLINE | ID: mdl-36698967

ABSTRACT

Chromosome conformation capture (3C) is a method of measuring chromosome topology in terms of loci interaction. The Hi-C method is a derivative of 3C that allows for genome-wide quantification of chromosome interaction. From such interaction data, it is possible to infer the three-dimensional (3D) structure of the underlying chromosome. In this paper, we developed a novel method, HiC-GNN, for predicting the 3D structures of chromosomes from Hi-C data. HiC-GNN differs from other methods for chromosome structure prediction in that the models learned by HiC-GNN can be generalized to data that is distinct from the training data. This aspect of HiC-GNN allows models that were trained on one Hi-C contact map to be used for inference on entirely different maps. To the authors' knowledge, this generalizing capability is not present in any existing methods. HiC-GNN uses a node embedding algorithm and a graph neural network to predict the 3D coordinates of each genomic locus from the corresponding Hi-C contact data. Unlike other methods, our algorithm allows for the storage of pre-trained parameters, thus enabling prediction on data that is entirely different from the training data. We show that our method can accurately generalize a single model across Hi-C resolutions, multiple restriction enzymes, and multiple cell populations while maintaining reconstruction accuracy across three Hi-C datasets. Our algorithm outperforms the state-of-the-art methods in accuracy of prediction and runtime and introduces a novel method for 3D structure prediction from Hi-C data. All our source code and data are available at https://github.com/OluwadareLab/HiC-GNN.

4.
SN Comput Sci ; 4(2): 114, 2023.
Article in English | MEDLINE | ID: mdl-36573207

ABSTRACT

This paper presents a consensus-based approach that incorporates three microarray and three RNA-Seq methods for unbiased and integrative identification of differentially expressed genes (DEGs) as potential biomarkers for critical disease(s). The proposed method performs satisfactorily on two microarray datasets (GSE20347 and GSE23400) and one RNA-Seq dataset (GSE130078) for esophageal squamous cell carcinoma (ESCC). Based on the input dataset, our framework employs specific DE methods to detect DEGs independently. A consensus-based function that first considers DEGs common to all three methods for further downstream analysis has been introduced. The consensus function employs other parameters to overcome information loss. Differential co-expression (DCE) and preservation analysis of DEGs facilitates the study of behavioral changes in interactions among DEGs under normal and diseased circumstances. Considering hub genes in biologically relevant modules and the most GO- and pathway-enriched DEGs as candidates for potential biomarkers of ESCC, we perform further validation through biological analysis as well as literature evidence. We have identified 25 DEGs that have strong biological relevance to their respective datasets and have previous literature establishing them as potential biomarkers for ESCC. We have further identified 8 additional DEGs as probable potential biomarkers for ESCC, but recommend further in-depth analysis.
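The consensus step described in this abstract (DEGs common to all three methods first, then a fallback to limit information loss) can be illustrated with a minimal sketch. The abstract does not specify the fallback parameters, so the `min_votes` relaxation below is an assumption made for illustration, and the method names and gene identifiers are hypothetical placeholders rather than results from the paper.

```python
from collections import Counter


def consensus_degs(method_results, min_votes=None):
    """Return genes reported by at least `min_votes` methods.

    By default a gene must be reported by every method (strict consensus).
    Lowering `min_votes` is an assumed, illustrative relaxation; it is not
    the paper's actual consensus function.
    """
    if min_votes is None:
        min_votes = len(method_results)
    # Count each gene once per method, even if a method lists it twice.
    votes = Counter(g for genes in method_results.values() for g in set(genes))
    return sorted(g for g, n in votes.items() if n >= min_votes)


# Hypothetical gene lists from three DE methods (placeholders only).
results = {
    "method_a": ["GENE1", "GENE2", "GENE3"],
    "method_b": ["GENE2", "GENE3", "GENE4"],
    "method_c": ["GENE3", "GENE2", "GENE4"],
}

strict = consensus_degs(results)                 # genes found by all three
relaxed = consensus_degs(results, min_votes=2)   # genes found by at least two
```

A strict intersection is conservative; the relaxed vote shows one simple way a consensus function might recover genes that a single method missed.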

5.
Brief Bioinform ; 24(1), 2023 01 19.
Article in English | MEDLINE | ID: mdl-36534961

ABSTRACT

The inference of large-scale gene regulatory networks is essential for understanding comprehensive interactions among genes. Most existing methods are limited to reconstructing networks with a few hundred nodes. Therefore, parallel computing paradigms must be leveraged to construct large networks. We propose a generic parallel framework that enables any existing method, without re-engineering, to infer large networks in parallel, guaranteeing quality output. The framework is tested on 15 inference methods (though it is not limited to these) employing in silico benchmarks and real-world large expression matrices, followed by qualitative and speedup assessment. The framework does not compromise the quality of the base serial inference method. We rank the candidate methods and use the top-performing method to infer an Alzheimer's Disease (AD) affected network from large expression profiles of a triple transgenic mouse model consisting of 45,101 genes. The resultant network is further explored to obtain hub genes that emerge as functionally related to the disease. We partition the network into 41 modules and conduct pathway enrichment analysis, revealing that a good number of participating genes are collectively responsible for several brain disorders, including AD. Finally, we extract the interactions of a few known AD genes and observe that they are periphery genes connected to the network's hub genes. Availability: The R implementation of the framework is downloadable from https://github.com/Netralab/GenericParallelFramework.


Subject(s)
Alzheimer Disease , Gene Regulatory Networks , Animals , Mice , Alzheimer Disease/genetics , Brain , Animals, Genetically Modified , Algorithms
6.
BMC Bioinformatics ; 23(1): 17, 2022 Jan 06.
Article in English | MEDLINE | ID: mdl-34991439

ABSTRACT

BACKGROUND: A limitation of traditional differential expression analysis on small datasets involves the possibility of false positives and false negatives due to sample variation. Considering the recent advances in deep learning (DL) based models, we wanted to expand the state-of-the-art in disease biomarker prediction from RNA-seq data using DL. However, application of DL to RNA-seq data is challenging due to the absence of appropriate labels and the small sample size compared to the number of genes. Deep learning coupled with transfer learning can improve prediction performance on novel data by incorporating patterns learned from other related data. With the emergence of new disease datasets, biomarker prediction would be facilitated by having a generalized model that can transfer the knowledge of trained feature maps to the new dataset. To the best of our knowledge, there is no Convolutional Neural Network (CNN)-based model coupled with transfer learning to predict the significant upregulating (UR) and downregulating (DR) genes from both trained and untrained datasets. RESULTS: We implemented a CNN model, DEGnext, to predict UR and DR genes from gene expression data obtained from The Cancer Genome Atlas database. DEGnext uses biologically validated data along with logarithmic fold change values to classify differentially expressed genes (DEGs) as UR and DR genes. We applied transfer learning to our model to leverage the knowledge of trained feature maps to untrained cancer datasets. DEGnext's results were competitive (ROC scores between 88 and 99%) with those of five traditional machine learning methods: Decision Tree, K-Nearest Neighbors, Random Forest, Support Vector Machine, and XGBoost. DEGnext was robust and effective in terms of transferring learned feature maps to facilitate classification of unseen datasets. Additionally, we validated that the predicted DEGs from DEGnext were mapped to significant Gene Ontology terms and pathways related to cancer. CONCLUSIONS: DEGnext can classify DEGs into UR and DR genes from RNA-seq cancer datasets with high performance. This type of analysis, using biologically relevant fine-tuning data, may aid in the exploration of potential biomarkers and can be adapted for other disease datasets.


Subject(s)
Neoplasms , Neural Networks, Computer , Humans , Machine Learning , RNA-Seq , Support Vector Machine
7.
IEEE Trans Neural Netw Learn Syst ; 33(12): 7706-7716, 2022 12.
Article in English | MEDLINE | ID: mdl-34138724

ABSTRACT

Most modern neural networks for classification fail to take into account the concept of the unknown. Trained neural networks are usually tested in an unrealistic scenario with only examples from a closed set of known classes. In an attempt to develop a more realistic model, the concept of working in an open set environment has been introduced. This in turn leads to the concept of incremental learning where a model with its own architecture and initial trained set of data can identify unknown classes during the testing phase and autonomously update itself if evidence of a new class is detected. Some problems that arise in incremental learning are inefficient use of resources to retrain the classifier repeatedly and the decrease of classification accuracy as multiple classes are added over time. This process of instantiating new classes is repeated as many times as necessary, accruing errors. To address these problems, this article proposes the classification confidence threshold (CT) approach to prime neural networks for incremental learning to keep accuracies high by limiting forgetting. A lean method is also used to reduce resources used in the retraining of the neural network. The proposed method is based on the idea that a network is able to incrementally learn a new class even when exposed to a limited number of samples associated with the new class. This method can be applied to most existing neural networks with minimal changes to network architecture.
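The general idea of rejecting low-confidence predictions in an open set setting can be sketched as follows. This is a generic softmax-thresholding illustration, not the article's CT training procedure; the function names and the threshold value of 0.9 are assumptions chosen for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_open_set(logits, class_names, threshold=0.9):
    """Return (label, confidence); label is 'unknown' below the threshold.

    The threshold is a hypothetical value for illustration only.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return class_names[best], probs[best]


names = ["cat", "dog", "bird"]
confident = classify_open_set([5.0, 0.0, 0.0], names)   # one class dominates
uncertain = classify_open_set([1.0, 0.9, 1.1], names)   # near-uniform scores
```

Samples flagged as unknown could then be collected as candidate evidence for a new class before any retraining is triggered.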


Subject(s)
Machine Learning , Neural Networks, Computer , Learning
8.
Comput Biol Med ; 137: 104820, 2021 10.
Article in English | MEDLINE | ID: mdl-34508973

ABSTRACT

scRNA-seq data analysis enables new possibilities for identification of novel cells, specific characterization of known cells and study of cell heterogeneity. The performance of most clustering methods especially developed for scRNA-seq is greatly influenced by user input. We propose a centrality-clustering method named UICPC and compare its performance with 9 state-of-the-art clustering methods on 11 real-world scRNA-seq datasets to demonstrate its effectiveness and usefulness in discovering cell groups. Our method does not require user input. However, it requires threshold settings, which are benchmarked after performing extensive experiments. We observe that most compared approaches show poor performance due to high heterogeneity and large dataset dimensions. However, UICPC shows excellent performance in terms of NMI, Purity and ARI. UICPC is available as an R package and can be downloaded from https://sites.google.com/view/hussinchowdhury/software.


Subject(s)
RNA, Small Cytoplasmic , Algorithms , Cluster Analysis , Data Analysis , Gene Expression Profiling , Sequence Analysis, RNA , Single-Cell Analysis , Software
9.
Med Biol Eng Comput ; 59(4): 989-1004, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33840048

ABSTRACT

Effective biomarkers aid in the early diagnosis and monitoring of breast cancer and thus play an important role in the treatment of patients suffering from the disease. Growing evidence indicates that alteration of expression levels of miRNA is one of the principal causes of cancer. We analyze breast cancer miRNA data to discover a list of biclusters as well as breast cancer miRNA biomarkers which can help to better understand this critical disease and to make important clinical decisions for treatment and diagnosis. In this paper, we propose a pattern-based parallel biclustering algorithm termed Rank-Preserving Biclustering (RPBic). The key strategy is to identify rank-preserved rows under a subset of columns based on a modified version of the all substrings common subsequence (ALCS) framework. To illustrate the effectiveness of the RPBic algorithm, we consider synthetic datasets and show that RPBic outperforms relevant biclustering algorithms in terms of relevance and recovery. For breast cancer data, we identify 68 biclusters and establish that they have strong clinical characteristics among the samples. The differentially co-expressed miRNAs are found to be involved in KEGG cancer related pathways. Moreover, we identify frequency-based biomarkers (hsa-miR-410, hsa-miR-483-5p) and network-based biomarkers (hsa-miR-454, hsa-miR-137) which we validate to have strong connectivity with breast cancer. The source code and the datasets used can be found at http://agnigarh.tezu.ernet.in/~rosy8/Bioinformatics_RPBic_Data.rar .
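What "rank-preserved rows under a subset of columns" means can be shown with a small sketch: rows belong together when their values induce the same ordering of the chosen columns. This brute-force grouping is only an illustration of the concept; it is not the ALCS-based RPBic algorithm, and the matrix values are made up.

```python
from collections import defaultdict


def rank_pattern(row, cols):
    """Order the selected column indices by the row's values (ascending)."""
    return tuple(sorted(cols, key=lambda c: row[c]))


def group_rank_preserving(matrix, cols):
    """Group row indices that share one rank pattern over `cols`.

    Only groups with at least two rows are returned.
    """
    groups = defaultdict(list)
    for i, row in enumerate(matrix):
        groups[rank_pattern(row, cols)].append(i)
    return {p: rows for p, rows in groups.items() if len(rows) > 1}


# Made-up expression matrix: rows are miRNAs, columns are samples.
matrix = [
    [1.0, 5.0, 3.0, 9.0],
    [10.0, 50.0, 30.0, 2.0],
    [7.0, 2.0, 4.0, 1.0],
    [0.1, 0.9, 0.5, 7.0],
]
groups = group_rank_preserving(matrix, cols=[0, 1, 2])
```

Rows 0, 1 and 3 differ in scale but order columns 0, 2, 1 identically, so they form one candidate bicluster over those three columns.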


Subject(s)
Breast Neoplasms , MicroRNAs , Algorithms , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Female , Gene Expression Profiling , Humans , MicroRNAs/genetics
10.
IEEE Trans Neural Netw Learn Syst ; 32(2): 604-624, 2021 02.
Article in English | MEDLINE | ID: mdl-32324570

ABSTRACT

Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This article provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to many applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.


Subject(s)
Deep Learning , Natural Language Processing , Computer Systems , Humans , Linguistics , Neural Networks, Computer , Surveys and Questionnaires
11.
Comput Biol Med ; 126: 104023, 2020 11.
Article in English | MEDLINE | ID: mdl-33049478

ABSTRACT

Many complex diseases occur due to genetic factors. A perturbation in the pathway of gene interactions leads to such disorders. Even though a group of genes is responsible, a few significant genes act as a biomarker for disease, perturbing the healthy network. Identifying such marker genes or a set of genes that play a pivotal role in diseases helps drug prioritization. We propose a scheme for finding potential bio-markers using a multi-layer consensus-driven approach. We reconstruct a functional module guided disease sub-network, followed by a multi-step consensus of network inference methods and shared ontological terms. We perform centrality analysis on the sub-networks under consideration and report hub genes as potentially key players in the target disease. To establish our scheme's effectiveness, we use Alzheimer's Disease (AD) and Breast Cancer as candidate diseases for experimentation. We evaluate the significance of prioritized genes based on reported evidence. We observe that BRCA1, BRCA2, and PTEN are the essential genes for Breast Cancer, whereas MAPK1, APP, and CASP7 are the essential genes playing an important role during AD.
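The centrality step mentioned in this abstract can be illustrated with degree centrality, the simplest hub measure. The abstract does not state which centrality measures were used, so degree is an assumption here, and the edge list uses placeholder gene names rather than the reported sub-networks.

```python
from collections import defaultdict


def hub_genes(edges, top_k=2):
    """Rank genes by degree in an undirected network and return the top_k.

    Ties are broken alphabetically so the output is deterministic.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Placeholder sub-network (hypothetical gene names).
edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G1", "G4"),
    ("G2", "G3"),
    ("G4", "G5"),
]
hubs = hub_genes(edges)
```

In practice, several centrality measures (for example betweenness or closeness) could be combined before reporting hub genes.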


Subject(s)
Alzheimer Disease , Gene Regulatory Networks , Alzheimer Disease/genetics , Biomarkers , Consensus , Gene Expression Profiling , Gene Regulatory Networks/genetics , Humans
12.
J Biosci ; 45, 2020.
Article in English | MEDLINE | ID: mdl-32098912

ABSTRACT

A gene co-expression network (CEN) is of biological interest, since co-expressed genes share common functions and biological processes or pathways. Finding relationships among modules can reveal inter-modular preservation, and similarity in transcriptome, functional, and biological behaviors among modules of the same or two different datasets. There is no method which explores the one-to-one relationships and one-to-many relationships among modules extracted from control and disease samples based on both topological and semantic similarity using both microarray and RNA-seq data. In this work, we propose a novel fusion measure to detect mapping between modules from two sets of co-expressed modules extracted from control and disease stages of Alzheimer's disease (AD) and Parkinson's disease (PD) datasets. Our measure considers both topological and biological information of a module and is an estimation of four parameters, namely, semantic similarity, eigengene correlation, degree difference, and the number of common genes. We analyze the consensus modules shared between both control and disease stages in terms of their association with diseases. We also validate the close associations between human and chimpanzee modules and compare with the state-of-the-art method. Additionally, we propose two novel observations on the relationships between modules for further analysis.


Subject(s)
Gene Expression Regulation , Gene Regulatory Networks/physiology , Transcriptome , Algorithms , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Animals , Databases, Genetic , Humans , Pan troglodytes , Parkinson Disease/genetics , Parkinson Disease/metabolism
13.
Article in English | MEDLINE | ID: mdl-29994618

ABSTRACT

Causality inference is the use of computational techniques to predict possible causal relationships for a set of variables, thereby forming a directed network. Causality inference in Gene Regulatory Networks (GRNs) is an important, yet challenging task due to the limits of available data and lack of efficiency in existing causality inference techniques. A number of techniques have been proposed and applied to infer causal relationships in various domains, although they are not specific to regulatory network inference. In this paper, we assess the effectiveness of methods for inferring causal GRNs. We introduce seven different inference methods and apply them to infer directed edges in GRNs. We use time-series expression data from the DREAM challenges to assess the methods in terms of quality of inference and rank them based on performance. The best method is applied to Breast Cancer data to infer a causal network. Experimental results show that Causation Entropy performs best; however, it is highly time-consuming and not feasible to use in a relatively large network. We infer the Breast Cancer GRN with the second-best method, Transfer Entropy. The topological analysis of the network reveals that top out-degree genes such as SLC39A5, which are considered central genes, play an important role in cancer progression.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks/genetics , Models, Statistical , Breast Neoplasms/genetics , Causality , Entropy , Escherichia coli/genetics , Female , Humans , Transcriptome/genetics , Yeasts/genetics
14.
IEEE/ACM Trans Comput Biol Bioinform ; 17(4): 1154-1173, 2020.
Article in English | MEDLINE | ID: mdl-30668502

ABSTRACT

Analysis of gene expression data is widely used in transcriptomic studies to understand functions of molecules inside a cell and interactions among molecules. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review the best practices in gene expression data analysis in terms of analysis of (differential) co-expression, co-expression network, differential networking, and differential connectivity considering both microarray and RNA-seq data along with comparisons. We highlight hurdles in RNA-seq data analysis using methods developed for microarrays. We include discussion of necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by including preprocessing and scRNA-seq in co-expression analysis along with useful tools specific to scRNA-seq. To provide insights, biological interpretation and functional profiling are included. Finally, we provide guidelines for the analyst, along with research issues and challenges which should be addressed.


Subject(s)
Gene Expression Profiling , Animals , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Gene Regulatory Networks/genetics , Humans , Oligonucleotide Array Sequence Analysis , RNA-Seq , Transcriptome/genetics
15.
Article in English | MEDLINE | ID: mdl-30281477

ABSTRACT

Analysis of RNA-sequence (RNA-seq) data is widely used in transcriptomic studies and it has many applications. We review RNA-seq data analysis from RNA-seq reads to the results of differential expression analysis. In addition, we perform a descriptive comparison of tools used in each step of RNA-seq data analysis along with a discussion of important characteristics of these tools. A taxonomy of tools is also provided. A discussion of issues in quality control and visualization of RNA-seq data is also included along with useful tools. Finally, we provide some guidelines for the RNA-seq data analyst, along with research issues and challenges which should be addressed.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , RNA/genetics , Sequence Analysis, RNA/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Quality Control , Sequence Analysis, RNA/standards , Software , Transcriptome/genetics
16.
Comput Biol Med ; 113: 103380, 2019 10.
Article in English | MEDLINE | ID: mdl-31415946

ABSTRACT

In the recent past, a number of methods have been developed for analysis of biological data. Among these methods, gene co-expression networks have the ability to mine functionally related genes with similar co-expression patterns, because of which such networks have been most widely used. However, gene co-expression networks cannot identify genes that undergo condition-specific changes in their relationships with other genes. In contrast, differential co-expression analysis enables finding co-expressed genes exhibiting significant changes across disease conditions. In this paper, we present some significant outcomes of a comparative study of four co-expression network module detection techniques, namely, THD-Module Extractor, DiffCoEx, MODA, and WGCNA, which can perform differential co-expression analysis on both gene and miRNA expression data (microarray and RNA-seq), and discuss the applications to Alzheimer's disease and Parkinson's disease research. Our observations reveal that compared to other methods, THD-Module Extractor is the most effective in finding modules with higher functional relevance and biological significance.


Subject(s)
Alzheimer Disease , Databases, Genetic , Gene Expression Profiling , Gene Regulatory Networks , Parkinson Disease , Transcriptome , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Biomarkers/metabolism , Humans , Parkinson Disease/genetics , Parkinson Disease/metabolism
17.
Comput Biol Chem ; 77: 373-389, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30466046

ABSTRACT

Genes interact with each other and may cause perturbation in the molecular pathways leading to complex diseases. Often, instead of any single gene, a subset of genes interact, forming a network, to share common biological functions. Such a subnetwork is called a functional module or motif. Identifying such modules and the central key genes in them that may be responsible for a disease may help design patient-specific drugs. In this study, we consider the neurodegenerative Alzheimer's Disease (AD) and identify potentially responsible genes from functional motif analysis. We start from the hypothesis that central genes in genetic modules are more relevant to a disease that is under investigation and identify hub genes from the modules as potential marker genes. Motifs or modules are often non-exclusive or overlapping in nature. Moreover, they sometimes show intrinsic or hierarchical distributions with overlapping functional roles. To the best of our knowledge, no prior work handles both situations in an integrated way. We propose a non-exclusive clustering approach, CluViaN (Clustering Via Network), that can detect intrinsic as well as overlapping modules from gene co-expression networks constructed using microarray expression profiles. We compare our method with existing methods to evaluate the quality of modules extracted. CluViaN reports the presence of intrinsic and overlapping motifs in different species not reported by any other research. We further apply our method to extract significant AD-specific modules using CluViaN and rank them based on the number of genes from a module involved in the disease pathways. Finally, top central genes are identified by topological analysis of the modules. We use two different AD phenotype data for experimentation. We observe that central genes, namely PSEN1, APP, NDUFB2, NDUFA1, UQCR10, PPP3R1 and a few more, play significant roles in AD. Interestingly, our experiments also find a hub gene, PML, which has recently been reported to play a role in plasticity, circadian rhythms and the response to proteins which can cause neurodegenerative disorders. MUC4, another hub gene that we find experimentally, is yet to be investigated for its potential role in AD. A software implementation of CluViaN in Java is available for download at https://sites.google.com/site/swarupnehu/publications/resources/CluViaN Software.rar.


Subject(s)
Alzheimer Disease/genetics , Gene Regulatory Networks , Algorithms , Cluster Analysis , Gene Expression Regulation , Genomics/methods , Humans , Phenotype , Transcriptome
18.
Comput Biol Chem ; 75: 154-167, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29787933

ABSTRACT

Developing a cost-effective and robust triclustering algorithm that can identify triclusters of high biological significance in the gene-sample-time (GST) domain is a challenging task. Most existing triclustering algorithms can detect shifting and scaling patterns in isolation, but they are not able to handle co-occurring shifting-and-scaling patterns. This paper makes an attempt to address this issue. It introduces a robust triclustering algorithm called THD-Tricluster to identify triclusters over the GST domain. In addition to being validated on several benchmark datasets, the proposed THD-Tricluster algorithm was applied to HIV-1 progression data to identify disease-specific genes. THD-Tricluster could identify the 38 genes most responsible for the disease, including GATA3, EGR1, JUN, ELF1, AGFG1, AGFG2, CX3CR1, CXCL12, CCR5, CCR2, and many others. The results are validated using GeneCard and other established results.


Subject(s)
Algorithms , HIV-1/genetics , Cluster Analysis , HIV-1/isolation & purification , Humans , Oligonucleotide Array Sequence Analysis
19.
Article in English | MEDLINE | ID: mdl-28113986

ABSTRACT

Network Alignment over graph-structured data has received considerable attention in many recent applications. Global network alignment tries to uniquely find the best mapping for a node in one network to only one node in another network. The mapping is performed according to some matching criteria that depend on the nature of data. In molecular biology, functional orthologs, protein complexes, and evolutionary conserved pathways are some examples of information uncovered by global network alignment. Current techniques for global network alignment suffer from several drawbacks, e.g., poor performance and high memory requirements. We address these problems by proposing IBNAL, Indexes-Based Network ALigner, for better alignment quality and faster results. To accelerate the alignment step, IBNAL makes use of a novel clique-based index and is able to align large networks in seconds. IBNAL produces a higher topological quality alignment and comparable biological match in alignment relative to other state-of-the-art aligners even though topological fit is primarily used to match nodes. IBNAL's results provide further evidence that homology information is more likely to be encoded in network topology than in sequence information.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping/methods , Proteins/chemistry , Proteins/metabolism , Algorithms , Animals , Humans , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism
20.
Sci Rep ; 7(1): 1072, 2017 04 21.
Article in English | MEDLINE | ID: mdl-28432361

ABSTRACT

Advancement in science has tended to improve treatment of fatal diseases such as cancer. A major concern in the area is the spread of cancerous cells, technically referred to as metastasis, into other organs beyond the primary organ. Treatment in such a stage of cancer is extremely difficult and usually palliative only. In this study, we focus on finding gene-gene network modules which are functionally similar in nature in the case of breast cancer. These modules extracted during the disease progression stages are analyzed using p-value and their associated pathways. We also explore interesting patterns associated with the causal genes, viz., SCGB1D2, MET, CYP1B1 and MMP9 in terms of expression similarity and pathway contexts. We analyze the genes involved in both stages, non-metastasis and metastasis, and the changes in their expression values, associated pathways and roles as the disease progresses from one stage to another. We discover three additional pathways viz., Glycerophospholipid metabolism, h-Efp pathway and CARM1 and Regulation of Estrogen Receptor, which can be related to the metastasis phase of breast cancer. These new pathways can be further explored to identify their relevance during the progression of the disease.


Subject(s)
Biomarkers, Tumor/analysis , Breast Neoplasms/pathology , Breast Neoplasms/secondary , Gene Regulatory Networks , Breast Neoplasms/diagnosis , Disease Progression , Female , Glycerophospholipids/metabolism , Humans , Protein-Arginine N-Methyltransferases/analysis , Receptors, Estrogen/analysis , Transcription Factors/analysis , Tripartite Motif Proteins/analysis , Ubiquitin-Protein Ligases/analysis