1.
Bioinformatics; 39(8), 2023 08 01.
Article in English | MEDLINE | ID: mdl-37527019

ABSTRACT

MOTIVATION: Many real-world problems can be modeled as annotated graphs. Scalable graph algorithms that extract actionable information from such data are in demand because these graphs are large, vary in topology, and carry diverse node and edge annotations. Graphs that change over time form dynamic graphs, which make it possible to find patterns across time points. In this article, we introduce a scalable algorithm that finds dense regions unique to individual time points in dynamic graphs. Such algorithms have applications in many areas, including the biological, financial, and social domains.

RESULTS: This manuscript makes three contributions. First, we designed a scalable algorithm, USNAP, that identifies dense subgraphs unique to a time stamp in a dynamic graph. Importantly, USNAP provides a lower bound on the density measure at each step of the greedy algorithm. Second, validating USNAP on real data shows its effectiveness. Although USNAP is domain independent, we applied it to four non-small cell lung cancer gene expression datasets, modeling the stages of non-small cell lung cancer as dynamic graphs that serve as input to USNAP. Pathway enrichment analyses and comprehensive interpretation of the literature show that USNAP identified biologically relevant mechanisms for different stages of cancer progression. Third, USNAP is scalable, with a time complexity of O(m + m_c log n_c + n_c log n_c), where m is the number of edges and n is the number of vertices in the dynamic graph, and m_c is the number of edges and n_c is the number of vertices in the collapsed graph.

AVAILABILITY AND IMPLEMENTATION: The code of USNAP is available at https://www.cs.utoronto.ca/~juris/data/USNAP22.


Subject(s)
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Humans; Algorithms
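
The abstract above describes USNAP only at a high level: a greedy algorithm that finds dense subgraphs unique to one time stamp. As a minimal sketch of that general idea, and not the authors' implementation, the Python below uses the classic greedy peeling heuristic for the densest subgraph (repeatedly remove the minimum-degree vertex and remember the densest intermediate vertex set), with density taken as edges divided by vertices. The function names `unique_edges` and `greedy_densest_subgraph`, the snapshot dictionary layout, and the rule that an edge counts as "unique" when it appears in no other snapshot are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def unique_edges(snapshots, stamp):
    """Edges present at `stamp` and in no other snapshot.

    Assumption: this is only one possible reading of "unique to a time
    stamp"; the source does not define the notion in the abstract.
    """
    def norm(edge):
        u, v = edge
        return (u, v) if u <= v else (v, u)

    own = {norm(e) for e in snapshots[stamp]}
    others = {norm(e) for s, es in snapshots.items() if s != stamp for e in es}
    return sorted(own - others)


def greedy_densest_subgraph(edges):
    """Greedy peeling: drop the minimum-degree vertex, keep the best set seen.

    Density is |E| / |V|. Vertex labels must be mutually comparable
    (for example all strings) because they are used as heap tie-breakers.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    alive = set(adj)
    num_edges = sum(degree.values()) // 2
    best_density = num_edges / len(alive) if alive else 0.0
    removal_order = []
    best_removed = 0  # number of removals that produced the best set

    while heap:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue  # stale heap entry
        alive.remove(v)
        removal_order.append(v)
        num_edges -= degree[v]
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        if alive:
            density = num_edges / len(alive)
            if density > best_density:
                best_density = density
                best_removed = len(removal_order)

    best_nodes = set(adj) - set(removal_order[:best_removed])
    return best_nodes, best_density
```

A caller would first reduce one snapshot to its unique edges with `unique_edges(snapshots, stamp)` and then pass the result to `greedy_densest_subgraph`. The heap-based peeling runs in O(m log n) time, which is a property of this sketch and says nothing about the complexity bound reported for USNAP.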
2.
Methods Mol Biol; 2401: 51-68, 2022.
Article in English | MEDLINE | ID: mdl-34902122

ABSTRACT

Gene expression microarrays are one of the most widely used high-throughput technologies in molecular biology, with applications such as identification of disease mechanisms and development of diagnostic and prognostic gene signatures. However, the success of these tasks is often limited because microarray analysis does not account for the complex relationships among genes, their products, and overall signaling and regulatory cascades. Incorporating protein-protein interaction data into microarray analysis can help address these challenges. This chapter reviews how protein-protein interactions can help with microarray analysis, leading to benefits such as better explanations of disease mechanisms, more complete gene annotations, improved prioritization of genes for future experiments, and gene signatures that generalize better to new data.


Subject(s)
Microarray Analysis; Biological Phenomena; Computational Biology; Gene Expression Profiling; Molecular Sequence Annotation
3.
Methods; 132: 34-41, 2018 01 01.
Article in English | MEDLINE | ID: mdl-28684340

ABSTRACT

Can we use graph mining algorithms to find patterns in tumor molecular mechanisms? Can we model disease progression with algorithms that compare multiple time-specific graphs? This paper addresses both questions. Our main contributions are: (1) we propose the Temporal-Omics (Temp-O) workflow, which models tumor progression in non-small cell lung cancer (NSCLC) using comparisons between multiple stage-specific graphs, and (2) we show that temporal structures are meaningful in the tumor progression of NSCLC. Other identified temporal structures that are not highlighted in this paper may also provide insights into possible novel mechanisms. Importantly, the Temp-O workflow is generic; although we applied it to NSCLC, it can be applied to other cancers and diseases. We used gene expression data from tumor samples across disease stages to model lung cancer progression, creating stage-specific tumor graphs. Validation in independent datasets showed that differences in temporal network structures capture diverse mechanisms in NSCLC. Furthermore, the structures are consistent and potentially biologically important: genes with similar protein names were captured in the same cliques, for all cliques in all datasets. The identified temporal structures are meaningful in the tumor progression of NSCLC because they agree with the molecular mechanisms of tumor progression or carcinogenesis in NSCLC. In particular, the identified major histocompatibility complex class II temporal structures capture mechanisms concerning carcinogenesis; the proteasome temporal structures capture mechanisms in early or late stages of lung cancer; and the ribosomal cliques capture the role of ribosome biosynthesis in cancer development and sustainment. Further, on a large independent dataset, we validated that temporal network structures identified proteins that are prognostic for overall survival in NSCLC adenocarcinoma.


Subject(s)
Carcinoma, Non-Small-Cell Lung/pathology; Lung Neoplasms/pathology; Biomarkers, Tumor/genetics; Biomarkers, Tumor/metabolism; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/metabolism; Carcinoma, Non-Small-Cell Lung/mortality; Disease Progression; Gene Regulatory Networks; Humans; Kaplan-Meier Estimate; Lung Neoplasms/genetics; Lung Neoplasms/metabolism; Lung Neoplasms/mortality; Models, Biological; Molecular Sequence Annotation; Transcriptome
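
The abstract above compares stage-specific graphs and reports cliques as the temporal structures of interest, without giving the comparison procedure. The sketch below illustrates one simple way such a comparison could look: enumerate maximal cliques in each stage graph with the Bron-Kerbosch algorithm (with pivoting) and keep the cliques that are maximal in one stage but not in any other. This is an illustrative simplification and not the Temp-O workflow itself; the function names, the `min_size` threshold, and the "not a maximal clique elsewhere" criterion are assumptions.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch and pivoting."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def stage_specific_cliques(stage_edges, min_size=3):
    """Cliques that are maximal in exactly one stage graph.

    Assumption: "stage-specific" means the same vertex set is not a
    maximal clique in any other stage; the source may use another rule.
    """
    cliques = {
        stage: {c for c in maximal_cliques(build_adjacency(edges))
                if len(c) >= min_size}
        for stage, edges in stage_edges.items()
    }
    specific = {}
    for stage, own in cliques.items():
        elsewhere = set().union(
            *(c for s, c in cliques.items() if s != stage)
        )
        specific[stage] = own - elsewhere
    return specific
```

Here `stage_edges` maps a stage label to the edge list of that stage's graph, and the result maps each stage to a set of `frozenset` vertex groups. Enumerating maximal cliques can take exponential time on dense graphs, so this sketch is suitable only for small illustrative inputs.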
4.
Proteomics; 15(2-3): 608-17, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25283527

ABSTRACT

While current protein interaction data provide a rich resource for molecular biology, they mostly lack condition-specific details. The abundance of mRNA data for most diseases offers the potential to model condition-specific transcriptional changes. Transcriptional data enable modeling of disease mechanisms, which in turn may suggest potential treatments. Although approaches for comparing networks constructed from healthy and disease samples have been developed, they do not provide a complete comparison, their evaluations are performed on very small networks, or they perform no systematic network analysis of differential network structures. We propose a novel method that efficiently exploits network structure information when comparing any graphs, and validate the results in non-small cell lung cancer. We introduce the notion of a differential graphlet community to detect deregulated subgraphs between any graphs in a way that exploits network structure information. The differential graphlet community approach systematically captures network structure differences between any graphs. Instead of using the connectivity of each protein or each edge, we used shortest path distributions on differential graphlet communities to exploit network structure information on the identified deregulated subgraphs. We validated the method by analyzing three non-small cell lung cancer datasets and validated the results on four independent datasets. We observed that shortest path lengths between genes in differential graphlet communities are significantly longer in normal graphs than in tumor graphs, suggesting that tumor cells create "shortcuts" between biological processes that may not be present under normal conditions.


Subject(s)
Carcinoma, Non-Small-Cell Lung/metabolism; Lung Neoplasms/metabolism; Protein Interaction Mapping/methods; Protein Interaction Maps; Systems Biology/methods; Carcinoma, Non-Small-Cell Lung/genetics; Gene Expression Regulation, Neoplastic; Humans; Lung/metabolism; Lung Neoplasms/genetics; Proteins/genetics; Proteins/metabolism; Signal Transduction
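
The abstract above compares shortest path length distributions between the same genes in normal and tumor graphs. A minimal sketch of that comparison step follows, assuming unweighted, undirected graphs given as edge lists and a gene set already chosen by some upstream method (the differential graphlet community detection itself is not shown). The function names and the toy inputs in the usage note are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def community_path_lengths(edges, genes):
    """Shortest path lengths between all reachable pairs of `genes`.

    Pairs with no connecting path are skipped rather than counted as
    infinite; that choice is an assumption of this sketch.
    """
    adj = build_adjacency(edges)
    ordered = sorted(genes)
    lengths = []
    for i, u in enumerate(ordered):
        dist = bfs_distances(adj, u)  # one BFS per source gene
        for v in ordered[i + 1:]:
            if v in dist:
                lengths.append(dist[v])
    return lengths
```

Calling `community_path_lengths` once with the normal edge list and once with the tumor edge list yields two length distributions for the same gene set. Judging whether one is significantly longer than the other would still require a statistical test, which this sketch leaves out.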