Results 1 - 20 of 39
1.
Exp Mol Med ; 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38871819

ABSTRACT

It is apparent that various functional units within the cellular machinery are derived from RNAs. The evolution of sequencing techniques has resulted in significant insights into approaches for transcriptome studies. Organisms utilize RNA to govern cellular systems, and a heterogeneous class of RNAs is involved in regulatory functions. In particular, regulatory RNAs are increasingly recognized to participate in intricately functioning machinery across almost all levels of biological systems. These systems include those mediating chromatin arrangement, transcription, suborganelle stabilization, and posttranscriptional modifications. Any class of RNA exhibiting regulatory activity can be termed a class of regulatory RNA and is typically represented by noncoding RNAs, which constitute a substantial portion of the genome. These RNAs function based on the principle of structural changes through cis and/or trans regulation to facilitate mutual RNA‒RNA, RNA‒DNA, and RNA‒protein interactions. It has not been clearly elucidated whether regulatory RNAs identified through deep sequencing actually function in the anticipated mechanisms. This review addresses the dominant properties of regulatory RNAs at various layers of the cellular machinery and covers regulatory activities, structural dynamics, modifications, associated molecules, and further challenges related to therapeutics and deep learning.

2.
PLoS One ; 19(5): e0303205, 2024.
Article in English | MEDLINE | ID: mdl-38809874

ABSTRACT

Cannabis-related emergency department visits have increased after legalization of cannabis for medical and recreational use. Accordingly, the incidence of emergency department visits due to cannabinoid hyperemesis syndrome in patients with chronic cannabis use has also increased. The aim of this study was to examine trends in emergency department visits due to cannabinoid hyperemesis syndrome in Nevada and evaluate factors associated with the increased risk for emergency department visits. The State Emergency Department Databases of Nevada between 2013 and 2021 were used for investigating trends of emergency department visits for cannabinoid hyperemesis syndrome. We compared patients visiting the emergency department due to cannabinoid hyperemesis syndrome with those visiting the emergency department due to other causes and estimated the impact of cannabis commercialization for recreational use. Emergency department visits due to cannabinoid hyperemesis syndrome increased continuously during the study period. The number of emergency department visits per 100,000 was 1.07 before commercialization for recreational use. It increased to 2.22 per 100,000 (by approximately 1.1 per 100,000) after commercialization in the third quarter of 2017. Those with cannabinoid hyperemesis syndrome were younger with fewer male patients than those without cannabinoid hyperemesis syndrome. A substantial increase in emergency department visits due to cannabinoid hyperemesis syndrome occurred in Nevada, especially after the commercialization of recreational cannabis. Further study is needed to explore factors associated with emergency department visits.


Subject(s)
Cannabinoids , Emergency Service, Hospital , Vomiting , Humans , Emergency Service, Hospital/statistics & numerical data , Male , Female , Adult , Vomiting/chemically induced , Vomiting/epidemiology , Nevada/epidemiology , Cannabinoids/adverse effects , Young Adult , Middle Aged , Adolescent , Syndrome , Incidence , Cannabinoid Hyperemesis Syndrome , Emergency Room Visits
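The per-100,000 rates reported in the abstract above can be compared in two ways: as an absolute difference and as a ratio. The sketch below is only an illustration using the two rounded figures from the abstract (1.07 and 2.22); the function name and the example counts are hypothetical, and the study's own "approximately 1.1" figure may come from unrounded data or a model-based estimate.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Convert a raw count into a rate per 100,000 population."""
    return cases / population * 100_000


# Rounded rates taken from the abstract (visits per 100,000).
rate_before = 1.07
rate_after = 2.22

# Absolute change: how many additional visits per 100,000.
absolute_change = rate_after - rate_before

# Relative change: how many times higher the later rate is.
rate_ratio = rate_after / rate_before

print(f"absolute change: {absolute_change:.2f} per 100,000")
print(f"rate ratio: {rate_ratio:.2f}")
```

With the rounded inputs, the rate roughly doubles, which is consistent with the "substantial increase" described in the abstract.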
3.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38807262

ABSTRACT

Sexual dimorphism in prevalence, severity and genetic susceptibility exists for most common diseases. However, most genetic and clinical outcome studies are designed in a sex-combined framework that treats sex as a covariate. A few sex-specific studies have analyzed males and females separately, but they failed to identify gene-by-sex interactions. Here, we propose a novel unified biologically interpretable deep learning-based framework (named SPIN) for sexual dimorphism analysis. We demonstrate that SPIN significantly improved the C-index by up to 23.6% in TCGA cancer datasets, and it was further validated using asthma datasets. In addition, SPIN identifies sex-specific and -shared risk loci that are often missed in previous sex-combined/-separate analyses. We also show that SPIN is interpretable for explaining how biological pathways contribute to sexual dimorphism and improves risk prediction at the individual level, which can result in the development of precision medicine tailored to a specific individual's characteristics.


Subject(s)
Neural Networks, Computer , Sex Characteristics , Humans , Female , Male , Deep Learning , Neoplasms/genetics , Neoplasms/metabolism , Asthma/genetics , Genetic Predisposition to Disease
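The C-index mentioned in the SPIN abstract above is a standard measure of how well predicted risk scores order patients by survival time. The following is a minimal pure-Python sketch of Harrell's concordance index, not the SPIN implementation; the function name and the toy data are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed
            # event strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.4, 0.1]))
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0.0 to a perfectly reversed ordering.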
4.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-37991247

ABSTRACT

The rapid growth of uncharacterized enzymes and their functional diversity urge accurate and trustworthy computational functional annotation tools. However, current state-of-the-art models lack trustworthiness on the prediction of the multilabel classification problem with thousands of classes. Here, we demonstrate that a novel evidential deep learning model (named ECPICK) makes trustworthy predictions of enzyme commission (EC) numbers with data-driven domain-relevant evidence, which results in significantly enhanced predictive power and the capability to discover potential new motif sites. ECPICK learns complex sequential patterns of amino acids and their hierarchical structures from 20 million enzyme data. ECPICK identifies significant amino acids that contribute to the prediction without multiple sequence alignment. Our intensive assessment showed not only outstanding enhancement of predictive performance on the largest databases of Uniprot, Protein Data Bank (PDB) and Kyoto Encyclopedia of Genes and Genomes (KEGG), but also a capability to discover new motif sites in microorganisms. ECPICK is a reliable EC number prediction tool to identify protein functions of an increasing number of uncharacterized enzymes.


Subject(s)
Deep Learning , Proteins/chemistry , Databases, Protein , Genome , Amino Acids
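Enzyme Commission numbers, which ECPICK predicts, are four-level hierarchical codes such as 1.1.1.1 (alcohol dehydrogenase), where each additional field narrows the enzyme class. The sketch below only illustrates that hierarchy by expanding a full EC number into its ancestor classes using the conventional dash notation; it is not part of ECPICK, and the function name is hypothetical.

```python
def ec_hierarchy(ec_number: str) -> list[str]:
    """Expand a complete EC number into its four hierarchy levels.

    Unspecified trailing fields are written as '-', following the
    usual convention for partial EC numbers.
    """
    parts = ec_number.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields: {ec_number!r}")
    return [".".join(parts[:k] + ["-"] * (4 - k)) for k in range(1, 5)]


# From the top-level class down to the specific enzyme entry.
print(ec_hierarchy("1.1.1.1"))
```

Treating the prediction target this way makes explicit why the task is a multilabel problem with thousands of classes: each level is itself a label, and the deeper levels multiply the number of possibilities.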
5.
Gerontol Geriatr Med ; 9: 23337214231189053, 2023.
Article in English | MEDLINE | ID: mdl-37529374

ABSTRACT

Telehealth has been widely accepted as an alternative to in-person primary care. This study examines whether the quality of primary care delivered via telehealth is equitable for older adults across racial and ethnic boundaries in provider-shortage urban settings. The study analyzed documentation of the 4Ms components (What Matters, Mobility, Medication, and Mentation) in relation to self-reported racial and ethnic backgrounds of 254 Medicare Advantage enrollees who used telehealth as their primary care modality in Southern Nevada from July 2021 through June 2022. Results revealed that Asian/Hawaiian/Pacific Islanders had significantly less documentation in What Matters (OR = 0.39, 95%, p = .04) and Blacks had significantly less documentation in Mobility (OR = 0.35, p < .001) compared to their White counterparts. The Hispanic ethnic group had less documentation in What Matters (OR = 0.18, p < .001) compared to non-Hispanic ethnic groups. Our study reveals equipping the geriatrics workforce merely with the 4Ms framework may not be sufficient in mitigating unconscious biases healthcare providers exhibit in the telehealth primary care setting in a provider shortage area, and, by extrapolation, in other care settings across the spectra, whether they be in-person or virtual.
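The odds ratios reported above compare the odds of a 4Ms component being documented in one group with the odds in a reference group. As a minimal sketch of the unadjusted calculation, the code below computes an odds ratio from a 2x2 table. The counts are entirely hypothetical and are not taken from the study, whose estimates may come from an adjusted regression model.

```python
def odds_ratio(group_yes: int, group_no: int, ref_yes: int, ref_no: int) -> float:
    """Unadjusted odds ratio from a 2x2 table.

    group_yes / group_no : documented / not documented in the group of interest
    ref_yes / ref_no     : documented / not documented in the reference group
    """
    if min(group_no, ref_yes, ref_no) <= 0:
        raise ValueError("cell counts used as divisors must be positive")
    return (group_yes / group_no) / (ref_yes / ref_no)


# Hypothetical counts for illustration only.
example = odds_ratio(group_yes=10, group_no=20, ref_yes=30, ref_no=15)
print(f"OR = {example:.2f}")
```

An odds ratio below 1, as in this toy example and in the abstract, indicates lower odds of documentation in the group of interest than in the reference group.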

6.
Arch Pharm Res ; 46(6): 535-549, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37261600

ABSTRACT

The study of transcriptome-wide variations and neurological disorders in the evolving field of genomic data science is on the rise. Deep learning has been highlighted for applying algorithms to massive amounts of data in a human-like manner, and is expected to predict the dependency or druggability of hidden mutations within the genome. An enormous number of mutational variants in coding and noncoding transcripts have been discovered across the genome thus far, despite the fine-tuned genetic proofreading machinery. These variants could be capable of inducing various pathological conditions, including neurological disorders, which require lifelong care. Several limitations and questions emerge, including the use of conventional processes via limited patient-driven sequence acquisitions and decoding-based inferences as well as how rare variants can be deduced as a population-specific etiology. These puzzles require harnessing of advanced systems for precise disease prediction, drug development and drug applications. In this review, we summarize the pathophysiological discoveries of pathogenic variants in both coding and noncoding transcripts in neurological disorders, and the current advantages of deep learning applications. In addition, we discuss the challenges encountered and how to overcome them with advancing interpretation.


Subject(s)
Deep Learning , Nervous System Diseases , Humans , Mutation , Transcriptome , Nervous System Diseases/drug therapy , Nervous System Diseases/genetics
7.
Article in English | MEDLINE | ID: mdl-37372743

ABSTRACT

Telehealth has been adopted as an alternative to in-person primary care visits. With multiple participants able to join remotely, telehealth can facilitate the discussion and documentation of advance care planning (ACP) for those with Alzheimer's disease-related disorders (ADRDs). We measured hospitalization-associated utilization outcomes, instances of hospitalization and 90-day re-hospitalizations from payors' administrative databases and verified the data via electronic health records. We estimated the hospitalization-associated costs using the Nevada State Inpatient Dataset and compared the estimated costs between ADRD patients with and without ACP documentation in the year 2021. Compared to the ADRD patients without ACP documentation, those with ACP documentation were less likely to be hospitalized (mean: 0.74; standard deviation: 0.31; p < 0.01) and were less likely to be readmitted within 90 days of discharge (mean: 0.16; standard deviation: 0.06; p < 0.01). The hospitalization-associated cost estimate for ADRD patients with ACP documentation (mean: USD 149,722; standard deviation: USD 80,850) was less than that of the patients without ACP documentation (mean: USD 200,148; standard deviation: USD 82,061; p < 0.01). Further geriatrics workforce training is called for to enhance ACP competencies for ADRD patients, especially in areas with provider shortages where telehealth plays a comparatively more important role.


Subject(s)
Advance Care Planning , Alzheimer Disease , Hospitalization , Primary Health Care , Telemedicine , Humans , Alzheimer Disease/therapy , Health Care Costs , Retrospective Studies , Cross-Sectional Studies , Hospitalization/economics , Hospitalization/statistics & numerical data , Middle Aged , Aged , Aged, 80 and over , Male
8.
Brain ; 146(4): 1267-1280, 2023 04 19.
Article in English | MEDLINE | ID: mdl-36448305

ABSTRACT

Phospholipase C (PLC) is an essential isozyme involved in the phosphoinositide signalling pathway, which maintains cellular homeostasis. Gain- and loss-of-function mutations in PLC affect enzymatic activity and are therefore associated with several disorders. Alternative splicing variants of PLC can interfere with complex signalling networks associated with oncogenic transformation and other diseases, including brain disorders. Cells and tissues with various mutations in PLC contribute to different phosphoinositide signalling pathways and disease progression; however, identifying cryptic mutations in PLC remains challenging. Herein, we review both the mechanisms underlying PLC regulation of the phosphoinositide signalling pathway and the genetic variation of PLC in several brain disorders. In addition, we discuss the present challenges associated with the potential of deep-learning-based analysis for the identification of PLC mutations in brain disorders.


Subject(s)
Brain Diseases , Deep Learning , Humans , Type C Phospholipases/genetics , Type C Phospholipases/metabolism , Phosphoinositide Phospholipase C/genetics , Phosphoinositide Phospholipase C/metabolism , Phosphatidylinositols/metabolism , Mutation/genetics
9.
Am J Phys Med Rehabil ; 102(4): 353-359, 2023 04 01.
Article in English | MEDLINE | ID: mdl-36095159

ABSTRACT

OBJECTIVE: The aim of the study is to evaluate opioid analgesic utilization and predictors for adverse events during hospitalization and discharge disposition among patients admitted with osteoarthritis or spine disorders. DESIGN: This is a retrospective study of 12,747 adult patients admitted to six private community hospitals from 2017 to 2020. Opioid use during hospitalization and risk factors for hospital-acquired adverse events and nonhome discharge were investigated. RESULTS: The total number of patients using opioids decreased; however, the daily morphine milligram equivalent use for patients on opioids increased from 2017 to 2020. Increased odds of nonhome discharge were associated with older age, Medicaid, Medicare insurance, and increased lengths of stay, increased body mass index, daily morphine milligram equivalent, and electrolyte replacement in the osteoarthritis group. In the spine group, older age, Black race, Medicaid, Medicare, no insurance, increased Charlson Comorbidity Index, lengths of stay, polypharmacy, and heparin use were associated with nonhome discharge. Adverse events were associated with increased age, lengths of stay, Medicare, polypharmacy, antiemetic, and benzodiazepine use in the osteoarthritis group and increased Charlson Comorbidity Index, lengths of stay, and electrolyte replacement in the spine group. CONCLUSIONS: Despite the decreasing number of patients using opioids over the years, patients on opioids had an increased daily morphine milligram equivalent over the same period.


Subject(s)
Analgesics, Opioid , Osteoarthritis , Adult , Humans , Aged , United States , Analgesics, Opioid/therapeutic use , Retrospective Studies , Inpatients , Medicare , Hospitalization , Hospitals , Osteoarthritis/drug therapy , Electrolytes , Morphine Derivatives
10.
PLoS One ; 17(12): e0278570, 2022.
Article in English | MEDLINE | ID: mdl-36455001

ABSTRACT

High-dimensional LASSO (Hi-LASSO) is a powerful feature selection tool for high-dimensional data. Our previous study showed that Hi-LASSO outperformed the other state-of-the-art LASSO methods. However, the substantial cost of bootstrapping and the lack of experiments for a parametric statistical test for feature selection have impeded the use of Hi-LASSO in practical applications. In this paper, the Python package and its Spark library are designed to run efficiently in parallel on real-world problems, and they provide parametric statistical tests for feature selection on high-dimensional data. We demonstrate that Hi-LASSO outperforms other methods in various intensive, practically oriented experiments. With these packages, Hi-LASSO can be run efficiently and easily for feature selection. Hi-LASSO packages are publicly available at https://github.com/datax-lab/Hi-LASSO under the MIT license. The packages can be easily installed by Python PIP, and additional documentation is available at https://pypi.org/project/hi-lasso and https://pypi.org/project/Hi-LASSO-spark.


Subject(s)
Drug Packaging , Libraries , APACHE , Gene Library
11.
Sci Rep ; 12(1): 19075, 2022 11 09.
Article in English | MEDLINE | ID: mdl-36351997

ABSTRACT

Digital pathology coupled with advanced machine learning (e.g., deep learning) has been changing the paradigm of whole-slide histopathological image (WSI) analysis. Major applications in digital pathology using machine learning include automatic cancer classification, survival analysis, and subtyping from pathological images. While most pathological image analyses are based on patch-wise processing due to the extremely large size of histopathology images, there are several applications that predict a single clinical outcome or perform pathological diagnosis per slide (e.g., cancer classification, survival analysis). However, current slide-based analyses are task-dependent, and a general framework of slide-based analysis in WSI has seldom been investigated. We propose a novel slide-based histopathology analysis framework that creates a WSI representation map, called HipoMap, that can be applied to any slide-based problem, coupled with convolutional neural networks. HipoMap converts a WSI of various shapes and sizes to a structured image-type representation. Our proposed HipoMap outperformed existing methods in intensive experiments with various settings and datasets. HipoMap achieved an area under the curve (AUC) of 0.96±0.026 (5% improved) in the experiments for lung cancer classification, and a c-index of 0.787±0.013 (3.5% improved) and a coefficient of determination (R²) of 0.978±0.032 (24% improved) in survival analysis and survival prediction with TCGA lung cancer data, respectively, as a general framework of slide-based analysis with a flexible capability. The results showed significant improvement compared to the current state-of-the-art methods on each task. We further discussed the experimental results of HipoMap from a pathological viewpoint and verified the performance using publicly available TCGA datasets. A Python package is available at https://pypi.org/project/hipomap, and the package can be easily installed using Python PIP. The open-source Python code is available at https://github.com/datax-lab/HipoMap.


Subject(s)
Deep Learning , Lung Neoplasms , Humans , Neural Networks, Computer , Image Processing, Computer-Assisted/methods , Machine Learning
12.
Article in English | MEDLINE | ID: mdl-35984789

ABSTRACT

The COVID-19 vaccine distribution route directly impacts a community's mortality and infection rates. Therefore, optimal vaccine dissemination would appreciably lower the death and infection rates. This paper proposes the Epidemic Vulnerability Index (EVI), which quantitatively evaluates a subject's potential risk. Our primary aim for the suggested index is to diminish both the infection rate and the death rate efficiently. EVI was accordingly designed with clinical factors determining mortality and social factors determining the infection rate. Through statistical analysis of a COVID-19 patient dataset and social network analysis with an agent-based model that is analogous to a real-world system, we define and experimentally validate the capability of EVI. Our experiments consist of nine vaccination distribution scenarios, including existing indexes that estimate risk, and stochastically propagate the contagion and the vaccine in a 300,000-agent graph network. We compared the outcome and variation of three metrics in the experiments: infection cases, death cases, and death rate. In this assessment, vaccination in descending order of EVI was shown to yield a significant outcome, with on average 5.0% fewer infection cases, 9.4% fewer death cases, and a 3.5% lower death rate than other vaccine distribution routes.
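The distribution strategy evaluated above, vaccinating in descending order of a risk index, can be sketched in a few lines. The code below assumes each agent already has a precomputed score; it does not reproduce the EVI formula, which the abstract does not give, and the function name, agent labels, and scores are hypothetical.

```python
def vaccination_order(scores: dict[str, float], doses: int) -> list[str]:
    """Return the agents to vaccinate, highest index score first.

    scores : mapping from agent identifier to a precomputed risk index
    doses  : number of vaccine doses available in this round
    """
    # sorted() is stable, so agents with equal scores keep insertion order.
    ranked = sorted(scores, key=lambda agent: scores[agent], reverse=True)
    return ranked[:doses]


# Hypothetical scores for four agents and two available doses.
scores = {"agent_a": 0.20, "agent_b": 0.90, "agent_c": 0.50, "agent_d": 0.35}
print(vaccination_order(scores, doses=2))
```

In a simulation such as the one described, this selection would be repeated each round as doses become available and the contagion spreads through the network.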

13.
Int J Mol Sci ; 23(12)2022 Jun 14.
Article in English | MEDLINE | ID: mdl-35743052

ABSTRACT

In recent years, deep learning has emerged as a highly active research field, achieving great success in various machine learning areas, including image processing, speech recognition, and natural language processing, and now rapidly becoming a dominant tool in biomedicine [...].


Subject(s)
Computational Biology , Deep Learning , Image Processing, Computer-Assisted/methods , Machine Learning , Natural Language Processing
14.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34791014

ABSTRACT

High-throughput next-generation sequencing now makes it possible to generate a vast amount of multi-omics data for various applications. These data have revolutionized biomedical research by providing a more comprehensive understanding of the biological systems and molecular mechanisms of disease development. Recently, deep learning (DL) algorithms have become one of the most promising methods in multi-omics data analysis, due to their predictive performance and capability of capturing nonlinear and hierarchical features. While integrating and translating multi-omics data into useful functional insights remain the biggest bottleneck, there is a clear trend towards incorporating multi-omics analysis in biomedical research to help explain the complex relationships between molecular layers. Multi-omics data have a role to improve prevention, early detection and prediction; monitor progression; interpret patterns and endotyping; and design personalized treatments. In this review, we outline a roadmap of multi-omics integration using DL and offer a practical perspective into the advantages, challenges and barriers to the implementation of DL in multi-omics data.


Subject(s)
Deep Learning , Genomics , Algorithms , High-Throughput Nucleotide Sequencing
15.
Bioinformatics ; 37(Suppl_1): i443-i450, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252964

ABSTRACT

MOTIVATION: Convolutional neural networks (CNNs) have achieved great success in the areas of image processing and computer vision, handling grid-structured inputs and efficiently capturing local dependencies through multiple levels of abstraction. However, a lack of interpretability remains a key barrier to the adoption of deep neural networks, particularly in predictive modeling of disease outcomes. Moreover, because biological array data are generally represented in a non-grid structured format, CNNs cannot be applied directly. RESULTS: To address these issues, we propose a novel method, called PathCNN, that constructs an interpretable CNN model on integrated multi-omics data using a newly defined pathway image. PathCNN showed promising predictive performance in differentiating between long-term survival (LTS) and non-LTS when applied to glioblastoma multiforme (GBM). The adoption of a visualization tool coupled with statistical analysis enabled the identification of plausible pathways associated with survival in GBM. In summary, PathCNN demonstrates that CNNs can be effectively applied to multi-omics data in an interpretable manner, resulting in promising predictive power while identifying key biological correlates of disease. AVAILABILITY AND IMPLEMENTATION: The source code is freely available at: https://github.com/mskspi/PathCNN.


Subject(s)
Glioblastoma , Humans , Image Processing, Computer-Assisted , Neural Networks, Computer , Software
16.
Sci Adv ; 7(21)2021 05.
Article in English | MEDLINE | ID: mdl-34138732

ABSTRACT

Bromodomain and extraterminal proteins (BET) are epigenetic readers that play critical roles in gene regulation. Pharmacologic inhibition of the bromodomain present in all BET family members is a promising therapeutic strategy for various diseases, but its impact on individual family members has not been well understood. Using a transcriptional induction paradigm in neurons, we have systematically demonstrated that three major BET family proteins (BRD2/3/4) participated in transcription with different recruitment kinetics, interdependency, and sensitivity to a bromodomain inhibitor, JQ1. In a mouse model of fragile X syndrome (FXS), BRD2/3 and BRD4 showed oppositely altered expression and chromatin binding, correlating with transcriptional dysregulation. Acute inhibition of CBP/p300 histone acetyltransferase (HAT) activity restored the altered binding patterns of BRD2 and BRD4 and rescued memory impairment in FXS. Our study emphasizes the importance of understanding the BET coordination controlled by a balanced action between HATs with different substrate specificity.


Subject(s)
Fragile X Syndrome , Nuclear Proteins , Animals , Fragile X Syndrome/genetics , Gene Expression Regulation , Histones/metabolism , Mice , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
17.
Proc Natl Acad Sci U S A ; 118(3)2021 01 19.
Article in English | MEDLINE | ID: mdl-33397809

ABSTRACT

Exon splicing triggered by unpredicted genetic mutation can cause translational variations in neurodegenerative disorders. In this study, we discover Alzheimer's disease (AD)-specific single-nucleotide variants (SNVs) and abnormal exon splicing of phospholipase c gamma-1 (PLCγ1) gene, using genome-wide association study (GWAS) and a deep learning-based exon splicing prediction tool. GWAS revealed that the identified single-nucleotide variations were mainly distributed in the H3K27ac-enriched region of PLCγ1 gene body during brain development in an AD mouse model. A deep learning analysis, trained with human genome sequences, predicted 14 splicing sites in human PLCγ1 gene, and one of these completely matched with an SNV in exon 27 of PLCγ1 gene in an AD mouse model. In particular, the SNV in exon 27 of PLCγ1 gene is associated with abnormal splicing during messenger RNA maturation. Taken together, our findings suggest that this approach, which combines in silico and deep learning-based analyses, has potential for identifying the clinical utility of critical SNVs in AD prediction.


Subject(s)
Alzheimer Disease/genetics , Deep Learning , Genetic Predisposition to Disease , Phospholipase C gamma/genetics , Alzheimer Disease/pathology , Animals , Computer Simulation , Disease Models, Animal , Exons/genetics , Genome, Human , Genome-Wide Association Study , High-Throughput Screening Assays , Humans , Mice , Polymorphism, Single Nucleotide/genetics , RNA Splicing/genetics , RNA, Messenger/genetics
19.
Methods ; 179: 3-13, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32442672

ABSTRACT

Digitizing whole-slide imaging in digital pathology has led to the advancement of computer-aided tissue examination using machine learning techniques, especially convolutional neural networks. A number of convolutional neural network-based methodologies have been proposed to accurately analyze histopathological images for cancer detection, risk prediction, and cancer subtype classification. Most existing methods have conducted patch-based examinations, due to the extremely large size of histopathological images. However, patches of a small window often do not contain sufficient information or patterns for the tasks of interest. Correspondingly, pathologists examine tissues at various magnification levels when checking complex morphological patterns under a microscope. We propose a novel multi-task based deep learning model for HIstoPatholOgy (named Deep-Hipo) that takes multi-scale patches simultaneously for accurate histopathological image analysis. Deep-Hipo extracts two patches of the same size at both high and low magnification levels, and captures complex morphological patterns in both large and small receptive fields of a whole-slide image. Deep-Hipo outperformed the current state-of-the-art deep learning methods. We assessed the proposed method on various types of whole-slide images of the stomach: well-differentiated, moderately-differentiated, and poorly-differentiated adenocarcinoma; poorly cohesive carcinoma, including signet-ring cell features; and normal gastric mucosa. The optimally trained model was also applied to histopathological images of The Cancer Genome Atlas (TCGA) Stomach Adenocarcinoma (TCGA-STAD) and TCGA Colon Adenocarcinoma (TCGA-COAD), which show pathological patterns similar to those of gastric carcinoma, and the experimental results were clinically verified by a pathologist. The source code of Deep-Hipo is publicly available at http://dataxlab.org/deep-hipo.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Pathology, Clinical/methods , Adenocarcinoma/diagnosis , Adenocarcinoma/pathology , Colonic Neoplasms/diagnosis , Colonic Neoplasms/pathology , Gastric Mucosa/pathology , Humans , Intestinal Mucosa/pathology , Stomach Neoplasms/diagnosis , Stomach Neoplasms/pathology
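The multi-scale idea in the Deep-Hipo abstract above, two patches of the same pixel size taken at high and low magnification around the same location, can be illustrated on a toy image. The sketch below is not the Deep-Hipo code: it represents an image as a nested list and simulates the lower magnification by striding over a wider region, whereas a real pipeline would presumably read different levels of a whole-slide image pyramid. All names and parameters are assumptions.

```python
def crop(image, top, left, size):
    """Cut a size x size window out of a 2D nested-list image."""
    return [row[left:left + size] for row in image[top:top + size]]


def multiscale_patches(image, center_row, center_col, size, factor):
    """Return (high, low) patches of identical shape around one center.

    high : size x size window at native resolution (small receptive field)
    low  : window factor times wider, subsampled by `factor`
           (large receptive field, same output shape)
    """
    wide = size * factor
    top_lo, left_lo = center_row - wide // 2, center_col - wide // 2
    if (top_lo < 0 or left_lo < 0
            or top_lo + wide > len(image) or left_lo + wide > len(image[0])):
        raise ValueError("center too close to the image border")
    high = crop(image, center_row - size // 2, center_col - size // 2, size)
    region = crop(image, top_lo, left_lo, wide)
    # Keep every `factor`-th pixel to mimic a lower magnification level.
    low = [row[::factor] for row in region[::factor]]
    return high, low


# Toy 16 x 16 "image" whose pixel value encodes its position.
image = [[r * 16 + c for c in range(16)] for r in range(16)]
high, low = multiscale_patches(image, center_row=8, center_col=8, size=4, factor=2)
print(len(high), len(high[0]), len(low), len(low[0]))
```

Both patches have the same shape, so they can be fed to parallel network branches, while the low-magnification patch covers a wider area of tissue.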
20.
Methods ; 173: 24-31, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31247294

ABSTRACT

Cancer is a genetic disease comprising multiple subtypes that have distinct molecular characteristics and clinical features. Cancer subtyping helps improve personalized treatment and decision making, as different cancer subtypes respond differently to treatment. The increasing availability of cancer-related genomic data provides the opportunity to identify molecular subtypes. Several unsupervised machine learning techniques have been applied to molecular data of tumor samples to identify cancer subtypes that are genetically and clinically distinct. However, most clustering methods often fail to efficiently cluster patients due to the challenges imposed by high-throughput genomic data and its non-linearity. In this paper, we propose a pathway-based deep clustering method (PACL) for molecular subtyping of cancer, which incorporates gene expression and a biological pathway database to group patients into cancer subtypes. The main contribution of our model is to discover high-level representations of biological data by learning complex hierarchical and nonlinear effects of pathways. We compared the performance of our model with a number of benchmark clustering methods that have recently been proposed for cancer subtyping. We assessed the hypothesis that clusters (subtypes) may be associated with different survival outcomes using logrank tests. PACL showed the lowest p-value in the logrank test among the benchmark methods. This demonstrates that the patient groups clustered by PACL may correspond to subtypes that are significantly associated with distinct survival distributions. Moreover, PACL provides a solution to comprehensively identify subtypes and interpret the model at the biological pathway level. The open-source software of PACL in PyTorch is publicly available at https://github.com/tmallava/PACL.


Subject(s)
Computational Biology/methods , Genomics/methods , Metabolic Networks and Pathways/genetics , Neoplasms/classification , Algorithms , Cluster Analysis , Humans , Neoplasms/genetics , Software
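The logrank test used above to compare survival between clusters can be sketched for the two-group case in pure Python. This is a generic textbook formulation, not the PACL evaluation code; the function name and toy data are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi-square statistic, p-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t in each group.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; groups cannot be compared")
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (hypothetical): group A fails early, group B fails late.
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

A smaller p-value indicates stronger evidence that the groups have different survival distributions, which is the sense in which the abstract compares clustering methods.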