1.
bioRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38045324

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disorder, and timely diagnosis is crucial for early interventions. AD is known to involve disrupted local and global brain neural connections, which may be instrumental in understanding and extracting specific biomarkers. Previous machine-learning approaches are mostly based on convolutional neural network (CNN) and standard vision transformer (ViT) models, which may not sufficiently capture the multidimensional local and global patterns that may be indicative of AD. Therefore, in this paper, we propose a novel approach called PVTAD to classify AD and cognitively normal (CN) cases using a pretrained pyramid vision transformer (PVT) and white matter (WM) of T1-weighted structural MRI (sMRI) data. Our approach combines the advantages of CNN and standard ViT to extract both local and global features indicative of AD from the WM coronal middle slices. We performed experiments on subjects with T1-weighted MPRAGE sMRI scans from the ADNI dataset. Our results demonstrate that PVTAD achieves an average accuracy of 97.7% and F1-score of 97.6%, outperforming the single and parallel CNN and standard ViT architectures based on sMRI data for AD vs. CN classification.

2.
NAR Genom Bioinform ; 5(2): lqad063, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37680392

ABSTRACT

To pave the road towards precision medicine in cancer, patients with similar biology ought to be grouped into the same cancer subtypes. Utilizing high-dimensional multiomics datasets, integrative approaches have been developed to uncover cancer subtypes. Recently, Graph Neural Networks have been shown to learn node embeddings utilizing node features and associations on graph-structured data. Some integrative prediction tools have been developed leveraging these advances on multiple networks, with some limitations. Addressing these limitations, we developed SUPREME, a node classification framework that integrates multiple data modalities on graph-structured data. On breast cancer subtyping, unlike existing tools, SUPREME generates patient embeddings from multiple similarity networks utilizing multiomics features and integrates them with raw features to capture complementary signals. On breast cancer subtype prediction tasks from three datasets, SUPREME outperformed other tools. SUPREME-inferred subtypes had significant survival differences, mostly having more significance than ground truth, and outperformed nine other approaches. These results suggest that with proper multiomics data utilization, SUPREME could demystify undiscovered characteristics in cancer subtypes that cause significant survival differences and could improve the ground truth label, which depends mainly on one datatype. In addition, to show the model-agnostic property of SUPREME, we applied it to two additional datasets and observed a clear performance advantage.
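
As a rough illustration of what a patient similarity network is, the sketch below links each patient to its k most similar patients by cosine similarity over a single feature matrix. The function names, the input format, the choice of cosine similarity, and the value of k are all assumptions made for illustration; the abstract does not say how SUPREME builds its networks.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_network(features, k=1):
    """Return undirected edges linking each patient to its k most similar patients.

    `features` maps a patient ID to a feature vector (hypothetical input format).
    """
    edges = set()
    for p, vec in features.items():
        others = [(cosine(vec, features[q]), q) for q in features if q != p]
        others.sort(reverse=True)  # most similar first
        for _, q in others[:k]:
            edges.add(tuple(sorted((p, q))))
    return edges

# Toy feature vectors, not real omics data.
patients = {
    "P1": [1.0, 0.0, 0.2],
    "P2": [0.9, 0.1, 0.3],
    "P3": [0.0, 1.0, 0.8],
    "P4": [0.1, 0.9, 0.7],
}
print(sorted(similarity_network(patients, k=1)))
```

One such network could be built per data modality, which is the sense in which the abstract speaks of multiple similarity networks.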

3.
Bioinformatics ; 39(39 Suppl 1): i149-i157, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387135

ABSTRACT

MOTIVATION: Alzheimer's disease (AD) is a neurodegenerative disease that affects millions of people worldwide. Mild cognitive impairment (MCI) is an intermediary stage between cognitively normal state and AD. Not all people who have MCI convert to AD. The diagnosis of AD is made after significant symptoms of dementia such as short-term memory loss are already present. Since AD is currently an irreversible disease, diagnosis at the onset of the disease brings a huge burden on patients, their caregivers, and the healthcare sector. Thus, there is a crucial need to develop methods for the early prediction of AD for patients who have MCI. Recurrent neural networks (RNN) have been successfully used to handle electronic health records (EHR) for predicting conversion from MCI to AD. However, RNN ignores irregular time intervals between successive events, which occur commonly in electronic health record data. In this study, we propose two deep learning architectures based on RNN, namely Predicting Progression of Alzheimer's Disease (PPAD) and PPAD-Autoencoder. PPAD and PPAD-Autoencoder are designed for early prediction of conversion from MCI to AD at the next visit and multiple visits ahead for patients, respectively. To minimize the effect of the irregular time intervals between visits, we propose using age in each visit as an indicator of time change between successive visits. RESULTS: Our experimental results conducted on Alzheimer's Disease Neuroimaging Initiative and National Alzheimer's Coordinating Center datasets showed that our proposed models outperformed all baseline models for most prediction scenarios in terms of F2 and sensitivity. We also observed that the age feature was one of the top features and was able to address the irregular time interval problem. AVAILABILITY AND IMPLEMENTATION: https://github.com/bozdaglab/PPAD.
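
The F2 metric named in the results weights sensitivity (recall) more heavily than precision, which suits a setting where missing a converter is costlier than a false alarm. A minimal sketch of the standard F-beta formula follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from confusion counts; beta > 1 weights recall above precision."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # recall is the same as sensitivity
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: 40 converters caught, 10 missed, 20 false alarms.
print(round(f_beta(tp=40, fp=20, fn=10), 4))
```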


Subject(s)
Alzheimer Disease , Cognitive Dysfunction , Deep Learning , Neurodegenerative Diseases , Humans , Alzheimer Disease/diagnostic imaging , Cognitive Dysfunction/diagnostic imaging , Electronic Health Records
4.
ACS Omega ; 8(23): 20379-20388, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37323377

ABSTRACT

The nuclear receptor (NR) superfamily includes phylogenetically related ligand-activated proteins, which play a key role in various cellular activities. NR proteins are subdivided into seven subfamilies based on their function, mechanism, and nature of the interacting ligand. Developing robust tools to identify NR could give insights into their functional relationships and involvement in disease pathways. Existing NR prediction tools only use a few types of sequence-based features and are tested on relatively similar independent datasets; thus, they may suffer from overfitting when extended to new genera of sequences. To address this problem, we developed Nuclear Receptor Prediction Tool (NRPreTo), a two-level NR prediction tool with a unique training approach in which, in addition to the sequence-based features used by existing NR prediction tools, six additional feature groups depicting various physicochemical, structural, and evolutionary features of proteins were utilized. The first level of NRPreTo allows for the successful prediction of a query protein as NR or non-NR and further subclassifies the protein into one of the seven NR subfamilies in the second level. We developed Random Forest classifiers to test on benchmark datasets, as well as the entire human protein datasets from RefSeq and Human Protein Reference Database (HPRD). We observed that using additional feature groups improved the performance. We also observed that NRPreTo achieved high performance on the external datasets and predicted 59 novel NRs in the human proteome. The source code of NRPreTo is publicly available at https://github.com/bozdaglab/NRPreTo.
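
The two-level scheme described above can be sketched as a simple cascade: a first classifier decides NR versus non-NR, and a second one runs only on proteins predicted to be NR. The function names, feature keys, and toy rules below are hypothetical stand-ins; they are not NRPreTo's trained Random Forest models or its real feature groups.

```python
def predict_two_level(protein_features, is_nr, nr_subfamily):
    """Level 1 decides NR vs non-NR; level 2 runs only for predicted NRs."""
    if not is_nr(protein_features):
        return "non-NR"
    return nr_subfamily(protein_features)

# Toy stand-ins for trained classifiers (hypothetical rules for illustration only).
def toy_is_nr(features):
    return features["level1_score"] >= 0.5

def toy_subfamily(features):
    return "subfamily-1" if features["level2_score"] < 0.5 else "subfamily-2"

query = {"level1_score": 0.9, "level2_score": 0.2}
print(predict_two_level(query, toy_is_nr, toy_subfamily))
```

Keeping the two levels as separate callables means each can be trained and evaluated on its own dataset.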

5.
bioRxiv ; 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36778453

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disease that affects millions of people worldwide. Mild cognitive impairment (MCI) is an intermediary stage between cognitively normal (CN) state and AD. Not all people who have MCI convert to AD. The diagnosis of AD is made after significant symptoms of dementia such as short-term memory loss are already present. Since AD is currently an irreversible disease, diagnosis at the onset of disease brings a huge burden on patients, their caregivers, and the healthcare sector. Thus, there is a crucial need to develop methods for the early prediction of AD for patients who have MCI. Recurrent Neural Networks (RNN) have been successfully used to handle Electronic Health Records (EHR) for predicting conversion from MCI to AD. However, RNN ignores irregular time intervals between successive events, which occur commonly in EHR data. In this study, we propose two deep learning architectures based on RNN, namely Predicting Progression of Alzheimer's Disease (PPAD) and PPAD-Autoencoder (PPAD-AE). PPAD and PPAD-AE are designed for early prediction of conversion from MCI to AD at the next visit and multiple visits ahead for patients, respectively. To minimize the effect of the irregular time intervals between visits, we propose using age in each visit as an indicator of time change between successive visits. Our experimental results conducted on Alzheimer's Disease Neuroimaging Initiative (ADNI) and National Alzheimer's Coordinating Center (NACC) datasets showed that our proposed models outperformed all baseline models for most prediction scenarios in terms of F2 and sensitivity. We also observed that the age feature was one of the top features and was able to address the irregular time interval problem.
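
The idea of using age at each visit as a proxy for elapsed time can be sketched as follows. The input format, function names, and feature values are assumptions for illustration and do not reflect the PPAD code or the ADNI/NACC data layout.

```python
def add_age_feature(visits):
    """Append the patient's age at each visit to that visit's feature vector.

    `visits` is a list of (age_in_years, features) tuples in chronological order
    (a hypothetical input format). The difference between consecutive ages carries
    the irregular interval information that a plain RNN would otherwise not see.
    """
    return [list(features) + [age] for age, features in visits]

def visit_gaps(visits):
    """Elapsed years between successive visits, implied by the age values."""
    ages = [age for age, _ in visits]
    return [round(later - earlier, 2) for earlier, later in zip(ages, ages[1:])]

# Three visits at irregular intervals; the feature values are made up.
visits = [(70.0, [28, 1.2]), (70.5, [27, 1.3]), (72.0, [24, 1.9])]
print(add_age_feature(visits))
print(visit_gaps(visits))
```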

6.
Sci Rep ; 12(1): 3717, 2022 03 08.
Article in English | MEDLINE | ID: mdl-35260634

ABSTRACT

DNA copy number aberrated regions in cancer are known to harbor cancer driver genes and the short non-coding RNA molecules, i.e., microRNAs. In this study, we integrated multi-omics datasets such as copy number aberration, DNA methylation, gene and microRNA expression to identify the signature microRNA-gene associations from frequently aberrated DNA regions across pan-cancer utilizing a LASSO-based regression approach. We studied 7294 patient samples associated with eighteen different cancer types from The Cancer Genome Atlas (TCGA) database and identified several cancer-specific and common microRNA-gene interactions enriched in experimentally validated microRNA-target interactions. We highlighted several oncogenic and tumor suppressor microRNAs that were cancer-specific and common in several cancer types. Our method substantially outperformed five state-of-the-art methods in selecting significantly known microRNA-gene interactions in multiple cancer types. Several microRNAs and genes were found to be associated with tumor survival and progression. Selected target genes were found to be significantly enriched in cancer-related pathways, cancer hallmark and Gene Ontology (GO) terms. Furthermore, subtype-specific potential gene signatures were discovered in multiple cancer types.
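
To make the LASSO idea concrete, here is a minimal sketch of cyclic coordinate descent with soft-thresholding, which drives weak predictors to exactly zero. It is a generic textbook formulation under the objective stated in the docstring; the abstract does not describe the study's actual solver, predictors, or penalty value, so the data and `lam` below are made up.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z if z else 0.0
    return w

# Toy data: the first predictor is strongly associated with y, the second weakly.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
print(lasso_coordinate_descent(X, y, lam=0.1))
```

In this toy case the weak predictor's coefficient is set to zero, which is the selection behavior that makes LASSO attractive for picking a small set of microRNA-gene associations.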


Subject(s)
MicroRNAs , Neoplasms , DNA Methylation , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Neoplasms/genetics , Oncogenes
7.
IEEE/ACM Trans Comput Biol Bioinform ; 19(5): 2950-2962, 2022.
Article in English | MEDLINE | ID: mdl-34283720

ABSTRACT

Uncovering genotype-phenotype relationships is a fundamental challenge in genomics. Gene prioritization is an important step in this endeavor, producing a short, manageable list from the thousands of genes coming from high-throughput studies. Network propagation methods are promising, state-of-the-art methods for gene prioritization based on the premise that functionally related genes tend to be close to each other in the biological networks. Recently, we introduced PhenoGeneRanker, a network-propagation algorithm for multiplex heterogeneous networks. PhenoGeneRanker allows multi-layer gene and phenotype networks. It also calculates empirical p values of gene and phenotype ranks using random stratified sampling of seeds of genes and phenotypes based on their connectivity degree in the network. In this study, we introduce the PhenoGeneRanker Bioconductor package and its application to multi-omics rat genome datasets to rank hypertension disease-related genes and strains. We showed that PhenoGeneRanker performed better at ranking hypertension disease-related genes using multiplex gene networks than using aggregated gene networks. We also showed that PhenoGeneRanker performed better at ranking hypertension disease-related strains using a multiplex phenotype network than using single or aggregated phenotype networks. We performed a rigorous hyperparameter analysis and, finally, showed that Gene Ontology (GO) enrichment of statistically significant top-ranked genes resulted in hypertension disease-related GO terms.
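
Network propagation is commonly implemented as a random walk with restart. The sketch below runs it on a single small undirected network; it is a simplified illustration, not PhenoGeneRanker itself, which operates on multiplex heterogeneous networks. The graph, the restart probability, and the function name are assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, n_iter=100):
    """Propagate from seed nodes: p <- (1 - r) * W p + r * p0.

    `adj` maps each node to its neighbors (undirected; every node needs at least
    one neighbor). W is the column-normalized adjacency matrix.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node splits its non-restart probability evenly among neighbors.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p

# A four-node chain A - B - C - D with A as the only seed.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(graph, seeds={"A"})
print(sorted(scores, key=scores.get, reverse=True))
```

Nodes closer to the seed receive higher scores, which matches the premise that functionally related genes lie near each other in the network.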


Subject(s)
Algorithms , Hypertension , Animals , Gene Regulatory Networks/genetics , Genomics/methods , Phenotype , Rats
8.
J Arthroplasty ; 37(4): 668-673, 2022 04.
Article in English | MEDLINE | ID: mdl-34954019

ABSTRACT

BACKGROUND: There have been efforts to reduce adverse events and unplanned readmissions after total joint arthroplasty. The Rothman Index (RI) is a real-time, composite measure of medical acuity for hospitalized patients. We aimed to examine the association among in-hospital RI scores and complications, readmissions, and discharge location after total knee arthroplasty (TKA). We hypothesized that RI scores could be used to predict the outcomes of interest. METHODS: This is a retrospective study of an institutional database of elective, primary TKA from July 2018 until December 2019. Complications and readmissions were defined per Centers for Medicare and Medicaid Services. Analysis included multivariate regression, computation of the area under the curve (AUC), and the Youden Index to set RI thresholds. RESULTS: The study cohort's (n = 957) complications (2.4%), readmissions (3.6%), and nonhome discharge (13.7%) were reported. All RI metrics (minimum, maximum, last, mean, range, 25th%, and 75th%) were significantly associated with increased odds of readmission and home discharge (all P < .05). RI scores were not significantly associated with complications. The optimal RI thresholds for increased risk of readmission were last ≤ 71 (AUC = 0.65), mean ≤ 67 (AUC = 0.66), or maximum ≤ 80 (AUC = 0.63). The optimal RI thresholds for increased risk of home discharge were minimum ≥ 53 (AUC = 0.65), mean ≥ 69 (AUC = 0.65), or maximum ≥ 81 (AUC = 0.60). CONCLUSION: RI values may be used to predict readmission or home discharge after TKA.
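
Threshold selection with the Youden Index can be sketched as a scan over candidate cutoffs, keeping the one that maximizes sensitivity + specificity - 1. The direction of the rule (flag a patient when the score is at or below the cutoff) mirrors the readmission thresholds reported above; the scores and outcomes are hypothetical, not study data.

```python
def youden_threshold(scores, outcomes):
    """Pick the cutoff t maximizing J = sensitivity + specificity - 1.

    A case is flagged as high risk when score <= t (lower scores mean higher
    risk here). `outcomes` holds True for patients who had the event.
    """
    best_t, best_j = None, -1.0
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    for t in sorted(set(scores)):
        tp = sum(1 for s, o in zip(scores, outcomes) if s <= t and o)
        tn = sum(1 for s, o in zip(scores, outcomes) if s > t and not o)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical index scores and readmission flags.
scores = [55, 60, 65, 70, 75, 80, 85, 90]
readmitted = [True, True, False, True, False, False, False, False]
print(youden_threshold(scores, readmitted))
```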


Subject(s)
Arthroplasty, Replacement, Hip , Arthroplasty, Replacement, Knee , Aftercare , Aged , Arthroplasty, Replacement, Hip/adverse effects , Arthroplasty, Replacement, Knee/adverse effects , Hospitals , Humans , Medicare , Patient Discharge , Patient Readmission , Postoperative Complications/epidemiology , Postoperative Complications/etiology , Retrospective Studies , Risk Factors , United States/epidemiology
9.
J Arthroplasty ; 37(3): 414-418, 2022 03.
Article in English | MEDLINE | ID: mdl-34793857

ABSTRACT

BACKGROUND: Identifying risk factors for adverse outcomes and increased costs following total joint arthroplasty (TJA) is needed to ensure quality. The interaction between pre-operative healthcare utilization (pre-HU) and outcomes following TJA has not been fully characterized. METHODS: This is a retrospective cohort study of patients undergoing elective, primary total hip arthroplasty (THA, N = 1785) or total knee arthroplasty (TKA, N = 2159) between 2015 and 2019 at a single institution. Pre-HU and post-operative healthcare utilization (post-HU) included non-elective healthcare utilization in the 90 days prior to and following TJA, respectively (emergency department, urgent care, observation admission, inpatient admission). Multivariate regression models including age, gender, American Society of Anesthesiologists, Medicaid status, and body mass index were fit for 30-day readmission, Centers for Medicare and Medicaid Services (CMS)-defined complications, length of stay, and post-HU. RESULTS: The 30-day readmission rate was 3.2% and 3.4% and the CMS-defined complication rate was 3.8% and 2.9% for THA and TKA, respectively. Multivariate regression showed that for THA, presence of any pre-HU was associated with increased risk of 30-day readmission (odds ratio [OR] 2.85, 95% confidence interval [CI] 1.48-5.50, P = .002), CMS complications (OR 2.42, 95% CI 1.27-4.59, P = .007), and post-HU (OR 3.65, 95% CI 2.54-5.26, P < .001). For TKA, ≥2 pre-HU events were associated with increased risk of 30-day readmission (OR 3.52, 95% CI 1.17-10.61, P = .026) and post-HU (OR 2.64, 95% CI 1.29-5.40, P = .008). There were positive correlations for THA (any pre-HU) and TKA (≥2 pre-HU) with length of stay and number of post-HU events. CONCLUSION: Patients who utilize non-elective healthcare in the 90 days prior to TJA are at increased risk of readmission, complications, and unplanned post-HU. LEVEL OF EVIDENCE: Level III.
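
For readers unfamiliar with how an odds ratio and its confidence interval are read, the sketch below computes an unadjusted OR with a Wald 95% CI from a 2x2 table. This is a simplification: the study reports adjusted ORs from multivariate regression, which this code does not reproduce, and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a: exposed with event,   b: exposed without event,
    c: unexposed with event, d: unexposed without event.
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: exposure = any pre-operative utilization, event = readmission.
print(odds_ratio_ci(a=10, b=20, c=5, d=40))
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level, which is how the intervals quoted above should be read.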


Subject(s)
Arthroplasty, Replacement, Hip , Patient Readmission , Aged , Arthroplasty, Replacement, Hip/adverse effects , Humans , Length of Stay , Medicare , Patient Acceptance of Health Care , Postoperative Complications/etiology , Retrospective Studies , Risk Factors , United States/epidemiology
10.
PLoS One ; 16(5): e0251399, 2021.
Article in English | MEDLINE | ID: mdl-33983999

ABSTRACT

To understand the driving biological factors for complex diseases like cancer, the regulatory circuitry of genes needs to be discovered. Recently, a new gene regulation mechanism called competing endogenous RNA (ceRNA) interactions has been discovered. Certain genes targeted by common microRNAs (miRNAs) "compete" for these miRNAs, thereby regulating each other by making others free from miRNA regulation. Several computational tools have been published to infer ceRNA networks. In most existing tools, however, expression abundance sufficiency, collective regulation, and groupwise effect of ceRNAs are not considered. In this study, we developed a computational tool named Crinet to infer genome-wide ceRNA networks addressing critical drawbacks. Crinet considers all mRNAs, lncRNAs, and pseudogenes as potential ceRNAs and incorporates a network deconvolution method to exclude the spurious ceRNA pairs. We tested Crinet on breast cancer data in TCGA. Crinet inferred reproducible ceRNA interactions and groups, which were significantly enriched in the cancer-related genes and processes. We validated the selected miRNA-target interactions with the protein expression-based benchmarks and also evaluated the inferred ceRNA interactions predicting gene expression change in knockdown assays. The hub genes in the inferred ceRNA network included known suppressor/oncogene lncRNAs in breast cancer, showing the importance of including non-coding RNAs for ceRNA inference. Crinet-inferred ceRNA groups that were consistently involved in the immune system related processes could be important assets in the light of the studies confirming the relation between immunotherapy and cancer. The source code of Crinet is in R and available at https://github.com/bozdaglab/crinet.
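
The starting point of ceRNA inference, genes that share miRNA regulators, can be sketched as a simple pairwise screen. This only illustrates the notion of "genes targeted by common miRNAs" and is not Crinet's algorithm, which is written in R and adds expression-based filtering and network deconvolution. The gene and miRNA names and the `min_shared` cutoff are hypothetical.

```python
from itertools import combinations

def candidate_cerna_pairs(targets, min_shared=2):
    """List gene pairs that share at least `min_shared` miRNA regulators.

    `targets` maps a gene to the set of miRNAs predicted to target it
    (hypothetical input). Sharing miRNAs is a necessary condition for
    competition, not sufficient evidence of a ceRNA interaction.
    """
    pairs = {}
    for g1, g2 in combinations(sorted(targets), 2):
        shared = targets[g1] & targets[g2]
        if len(shared) >= min_shared:
            pairs[(g1, g2)] = shared
    return pairs

# Made-up target sets for three genes.
targets = {
    "geneA": {"miR-1", "miR-2", "miR-3"},
    "geneB": {"miR-2", "miR-3"},
    "geneC": {"miR-3", "miR-4"},
}
print(candidate_cerna_pairs(targets))
```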


Subject(s)
Gene Regulatory Networks , MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Gene Expression Regulation, Neoplastic , Genomics/methods , Humans , Neoplasms/genetics
11.
PLoS One ; 15(6): e0234557, 2020.
Article in English | MEDLINE | ID: mdl-32555660

ABSTRACT

After mating, female mosquitoes need animal blood to develop their eggs. In the process of acquiring blood, they may acquire pathogens, which may cause different diseases in humans such as malaria, zika, dengue, and chikungunya. Therefore, knowing the parity status of mosquitoes is useful in control and evaluation of infectious diseases transmitted by mosquitoes, where parous mosquitoes are assumed to be potentially infectious. Ovary dissections, which are currently used to determine the parity status of mosquitoes, are very tedious and limited to a few experts. An alternative to ovary dissections is near-infrared spectroscopy (NIRS), which can estimate the age in days and the infectious state of laboratory and semi-field reared mosquitoes with accuracies between 80 and 99%. No study has tested the accuracy of NIRS for estimating the parity status of wild mosquitoes. In this study, we train artificial neural network (ANN) models on NIR spectra to estimate the parity status of wild mosquitoes. We use four different datasets: An. arabiensis collected from Minepa, Tanzania (Minepa-ARA); An. gambiae s.s. collected from Muleba, Tanzania (Muleba-GA); An. gambiae s.s. collected from Burkina Faso (Burkina-GA); and An. gambiae s.s. from Muleba and Burkina Faso combined (Muleba-Burkina-GA). We train ANN models on datasets with spectra preprocessed according to previous protocols. We then use autoencoders to reduce the spectra feature dimensions from 1851 to 10 and re-train the ANN models. Before the autoencoder was applied, ANN models estimated parity status of mosquitoes in Minepa-ARA, Muleba-GA, Burkina-GA and Muleba-Burkina-GA with out-of-sample accuracies of 81.9±2.8% (N = 274), 68.7±4.8% (N = 43), 80.3±2.0% (N = 48), and 75.7±2.5% (N = 91), respectively. With the autoencoder, ANN models tested on out-of-sample data achieved 97.1±2.2% (N = 274), 89.8±1.7% (N = 43), 93.3±1.2% (N = 48), and 92.7±1.8% (N = 91) accuracies for Minepa-ARA, Muleba-GA, Burkina-GA, and Muleba-Burkina-GA, respectively. These results show that a combination of an autoencoder and an ANN trained on NIR spectra to estimate the parity status of wild mosquitoes yields models that can be used as an alternative tool to estimate parity status of wild mosquitoes, especially since NIRS is a high-throughput, reagent-free, and simple-to-use technique compared to ovary dissections.


Subject(s)
Anopheles/physiology , Malaria/transmission , Mosquito Vectors/physiology , Neural Networks, Computer , Oviparity , Spectroscopy, Near-Infrared/methods , Animals , Female , Humans
12.
Article in English | MEDLINE | ID: mdl-34584774

ABSTRACT

Complex diseases such as hypertension, cancer, and diabetes cause nearly 70% of the deaths in the U.S. and involve multiple genes and their interactions with environmental factors. Therefore, identification of genetic factors to understand and decrease the morbidity and mortality from complex diseases is an important and challenging task. With the generation of an unprecedented amount of multi-omics datasets, network-based methods have become popular to represent the multilayered complex molecular interactions. In particular, node embeddings, the low-dimensional representations of nodes in a network, are utilized for gene function prediction. Integrated network analysis of multi-omics data alleviates the issues related to missing data and lack of context-specific datasets. Most of the node embedding methods, however, are unable to integrate multiple types of datasets from genes and phenotypes. To address this limitation, we developed a node embedding algorithm called Node Embeddings of Complex networks (NECo) that can utilize multilayered heterogeneous networks of genes and phenotypes. We evaluated the performance of NECo using genotypic and phenotypic datasets from rat (Rattus norvegicus) disease models to classify hypertension disease-related genes. Our method significantly outperformed the state-of-the-art node embedding methods, with an AUC of 94.97% compared with 85.98% for the second-best performer, and predicted genes not previously implicated in hypertension.
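
The AUC figures quoted above can be read as the probability that a randomly chosen disease-related gene is scored higher than a randomly chosen unrelated gene. A minimal sketch of that pairwise definition follows; the scores and labels are made up and the function name is an assumption.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 marks a disease-related gene.
print(auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```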

13.
BMC Bioinformatics ; 20(1): 115, 2019 Mar 06.
Article in English | MEDLINE | ID: mdl-30841846

ABSTRACT

BACKGROUND: RNA-seq, wherein RNA transcripts expressed in a sample are sequenced and quantified, has become a widely used technique to study disease and development. With RNA-seq, transcription abundance can be measured, and differentially expressed genes between groups and functional enrichment of those genes can be computed. However, biological insights from RNA-seq are often limited by computational analysis and the enormous volume of resulting data, preventing facile and meaningful review and interpretation of gene expression profiles. Particularly, in cases where the samples under study exhibit uncontrolled variation, deeper analysis of functional enrichment would be necessary to visualize samples' gene expression activity under each biological function. RESULTS: We developed a Bioconductor package, rgsepd, that streamlines RNA-seq data analysis by wrapping the commonly used tools DESeq2 and GOSeq in a user-friendly interface and performs a gene-subset linear projection to cluster heterogeneous samples by Gene Ontology (GO) terms. Rgsepd computes significantly enriched GO terms for each experimental condition and generates multidimensional projection plots highlighting how each predefined gene set's multidimensional expression may delineate samples. CONCLUSIONS: The rgsepd package serves to automate differential expression, functional annotation, and exploratory data analyses to highlight subtle expression differences among samples based on each significant biological function.


Subject(s)
Sequence Analysis, RNA/methods , Software , Gene Ontology , Heart Atria/metabolism , Humans , RNA/genetics , RNA/metabolism
14.
PLoS Comput Biol ; 14(7): e1006318, 2018 07.
Article in English | MEDLINE | ID: mdl-30011266

ABSTRACT

MicroRNAs (miRNAs) inhibit expression of target genes by binding to their RNA transcripts. It has been recently shown that RNA transcripts targeted by the same miRNA could "compete" for the miRNA molecules and thereby indirectly regulate each other. Experimental evidence has suggested that the aberration of such miRNA-mediated interaction between RNAs, called competing endogenous RNA (ceRNA) interaction, can play important roles in tumorigenesis. Given the difficulty of deciphering context-specific miRNA binding, and the existence of various gene regulatory factors such as DNA methylation and copy number alteration, inferring context-specific ceRNA interactions accurately is a computationally challenging task. Here we propose a computational method called Cancerin to identify cancer-associated ceRNA interactions. Cancerin incorporates DNA methylation, copy number alteration, gene and miRNA expression datasets to construct cancer-specific ceRNA networks. We applied Cancerin to three cancer datasets from the Cancer Genome Atlas (TCGA) project. Our results indicated that ceRNAs were enriched with cancer-related genes, and ceRNA modules in the inferred ceRNA networks were involved in cancer-associated biological processes. Using a LINCS-L1000 shRNA-mediated gene knockdown experiment in a breast cancer cell line to assess accuracy, Cancerin was able to predict expression outcome of ceRNA genes with high accuracy.


Subject(s)
Breast Neoplasms/genetics , Computer Simulation , Gene Regulatory Networks , Genes, Neoplasm , RNA, Neoplasm/genetics , Atlases as Topic , Cell Line, Tumor , DNA Copy Number Variations , DNA Methylation , Datasets as Topic , Female , Gene Expression Regulation, Neoplastic , Humans , MicroRNAs/genetics , Neoplasm Proteins/metabolism , Prognosis , Protein Binding , RNA Processing, Post-Transcriptional
15.
PLoS One ; 12(9): e0184590, 2017.
Article in English | MEDLINE | ID: mdl-28880957

ABSTRACT

Dysregulation of MST1/STK4, a key kinase component of the Hippo-YAP pathway, is linked to the etiology of many cancers with poor prognosis. However, how STK4 restricts the emergence of aggressive cancer remains elusive. Here, we investigated the effects of STK4, primarily localized in the cytoplasm, lipid raft, and nucleus, on cell growth and gene expression in aggressive prostate cancer. We demonstrated that lipid raft and nuclear STK4 had superior suppressive effects on cell growth in vitro and in vivo compared with cytoplasmic STK4. Using RNA sequencing and bioinformatics analysis, we identified several differentially expressed (DE) genes that responded to ectopic STK4 in all three subcellular compartments. We noted that the number of DE genes observed in lipid raft and nuclear STK4 cells was much greater than in cytoplasmic STK4 cells. Our functional annotation clustering showed that these DE genes were commonly associated with oncogenic pathways such as AR, PI3K/AKT, BMP/SMAD, GPCR, WNT, and RAS as well as unique pathways such as JAK/STAT, which emerged only in nuclear STK4 cells. These findings indicate that MST1/STK4/Hippo signaling restricts aggressive tumor cell growth by intersecting with multiple molecular pathways, suggesting that targeting of the STK4/Hippo pathway may have important therapeutic implications for cancer.


Subject(s)
Prostatic Neoplasms/metabolism , Protein Serine-Threonine Kinases/metabolism , Animals , Cell Line, Tumor , Cell Nucleus/metabolism , Computational Biology , Cytoplasm/metabolism , Fluorescent Antibody Technique , Hippo Signaling Pathway , Humans , Intracellular Signaling Peptides and Proteins , Male , Mice , Prostate/metabolism , Prostate/pathology , Signal Transduction/genetics , Signal Transduction/physiology
16.
Genomics ; 109(3-4): 233-240, 2017 07.
Article in English | MEDLINE | ID: mdl-28438487

ABSTRACT

Copy number amplifications and deletions that are recurrent in cancer samples harbor genes that confer a fitness advantage to tumor proliferation and survival. One important challenge in computational biology is to separate the causal (i.e., driver) genes from passenger genes in large, aberrated regions. Many previous studies focus on the genes within the aberration (i.e., cis genes), but do not utilize the genes that are outside of the aberrated region and dysregulated as a result of the aberration (i.e., trans genes). We propose a computational pipeline, called ProcessDriver, that prioritizes candidate drivers by relating cis genes to dysregulated trans genes and biological processes. ProcessDriver is based on the assumption that a driver cis gene should be closely associated with the dysregulated trans genes and biological processes, as opposed to previous studies that assume a driver cis gene should be the most correlated gene to the copy number of an aberrated region. We applied our method to breast, bladder and ovarian cancer data from the Cancer Genome Atlas database. Our results included previously known driver genes and cancer genes, as well as potentially novel driver genes. Additionally, survival analysis linked many genes in the final set of drivers to new tumor events after initial treatment. Our results highlight the importance of selecting driver genes based on their widespread downstream effects in trans.


Subject(s)
Breast Neoplasms/genetics , Gene Dosage , Genomics/methods , Oncogenes , Ovarian Neoplasms/genetics , Urinary Bladder Neoplasms/genetics , Algorithms , Breast Neoplasms/pathology , DNA Copy Number Variations , Disease Progression , Female , Humans , Ovarian Neoplasms/pathology , Urinary Bladder Neoplasms/pathology
17.
Plant J ; 89(5): 1042-1054, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27775877

ABSTRACT

Cowpea (Vigna unguiculata L. Walp.) is a legume crop that is resilient to hot and drought-prone climates, and a primary source of protein in sub-Saharan Africa and other parts of the developing world. However, genome resources for cowpea have lagged behind most other major crops. Here we describe foundational genome resources and their application to the analysis of germplasm currently in use in West African breeding programs. Resources developed from the African cultivar IT97K-499-35 include a whole-genome shotgun (WGS) assembly, a bacterial artificial chromosome (BAC) physical map, and assembled sequences from 4355 BACs. These resources and WGS sequences of an additional 36 diverse cowpea accessions supported the development of a genotyping assay for 51 128 SNPs, which was then applied to five bi-parental RIL populations to produce a consensus genetic map containing 37 372 SNPs. This genetic map enabled the anchoring of 100 Mb of WGS and 420 Mb of BAC sequences, an exploration of genetic diversity along each linkage group, and clarification of macrosynteny between cowpea and common bean. The SNP assay enabled a diversity analysis of materials from West African breeding programs. Two major subpopulations exist within those materials, one of which has significant parentage from South and East Africa and more diversity. There are genomic regions of high differentiation between subpopulations, one of which coincides with a cluster of nodulin genes. The new resources and knowledge help to define goals and accelerate the breeding of improved varieties to address food security issues related to limited-input small-holder farming and climate stress.


Subject(s)
Crops, Agricultural/genetics , Crops, Agricultural/physiology , Vigna/genetics , Vigna/physiology , Chromosomes, Artificial, Bacterial , Chromosomes, Plant/genetics , Climate , Food Supply , Genome, Plant/genetics , Genotype
18.
PLoS One ; 11(2): e0148977, 2016.
Article in English | MEDLINE | ID: mdl-26872146

ABSTRACT

DNA methylation is an important epigenetic event that affects gene expression during development and in various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given the methylation intensities of all these probes, it is necessary to determine which of them are most representative of the gene-centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene-centric DNA methylation from probe-level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of the selected probes on their mRNA expression levels and found that K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than the other algorithms on all metrics. We also observed that the transcriptional activities of certain genes were more sensitive to DNA methylation changes than those of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.


Subject(s)
Breast Neoplasms/genetics , DNA Methylation , Algorithms , Breast Neoplasms/metabolism , Cell Line, Tumor , DNA Probes , Epigenesis, Genetic , Female , Gene Expression , Gene Expression Regulation, Neoplastic , Gene Ontology , Humans , Multigene Family
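
The abstract above names sequential forward selection wrapped around a K-Nearest Neighbors classifier but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' code. The function names, the leave-one-out scoring, the stopping rule, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-NN classifier using only `features`.

    X is a list of samples (each a list of probe values); y holds one class
    label per sample (for example, a binned expression level). Hypothetical
    helper, not taken from the paper.
    """
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((X[i][f] - X[j][f]) ** 2 for f in features), j)
            for j in range(len(X))
            if j != i
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(y[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


def sequential_forward_selection(X, y, max_features=None, k=3):
    """Greedily add the probe that most improves leave-one-out accuracy.

    Stops when no remaining probe improves the score (an assumed stopping
    rule) or when `max_features` probes have been selected.
    """
    n_features = len(X[0])
    limit = max_features or n_features
    selected, best_score = [], 0.0
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(knn_loo_accuracy(X, y, selected + [f], k), f) for f in candidates]
        # Highest score wins; ties go to the lowest probe index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


# Toy data (invented): probe 0 separates the two classes, probe 1 is noise.
X = [[0.10, 0.5], [0.20, 0.9], [0.15, 0.1], [0.80, 0.6], [0.90, 0.2], [0.85, 0.8]]
y = [0, 0, 0, 1, 1, 1]
selected, score = sequential_forward_selection(X, y, k=1)
print(selected, score)
```

On the toy data the search keeps only the informative probe, because adding the noisy one cannot raise the leave-one-out accuracy any further.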
19.
Plant J ; 84(1): 216-27, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26252423

ABSTRACT

Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant.


Subject(s)
Chromosomes, Artificial, Bacterial/genetics , Genome, Plant/genetics , Hordeum/genetics , Molecular Sequence Data
20.
PLoS One ; 10(7): e0133356, 2015.
Article in English | MEDLINE | ID: mdl-26207811

ABSTRACT

Coarctation of the aorta (CoA) is a constriction of the proximal descending thoracic aorta and is one of the most common congenital cardiovascular defects. Treatments for CoA improve life expectancy, but morbidity persists, particularly due to the development of chronic hypertension (HTN). Identifying the mechanisms of morbidity is difficult in humans due to confounding variables such as age at repair, follow-up duration, coarctation severity and concurrent anomalies. We previously developed an experimental model that replicates aortic pathology in humans with CoA without these confounding variables, and mimics correction at various times using dissolvable suture. Here we present the most comprehensive description of differentially expressed genes (DEGs) to date from the pathology of CoA, which were obtained using this model. Aortic samples (n=4/group) from the ascending aorta that experiences elevated blood pressure (BP) from induction of CoA, and restoration of normal BP after its correction, were analyzed by gene expression microarray, and enriched genes were converted to human orthologues. 51 DEGs with >6 fold-change (FC) were used to determine enriched Gene Ontology terms, altered pathways, and association with National Library of Medicine Medical Subject Headings (MeSH) IDs for HTN, cardiovascular disease (CVD) and CoA. The results generated 18 pathways, 4 of which (cell cycle, immune system, hemostasis and metabolism) were shared with MeSH IDs for HTN and CVD, and individual genes were associated with the CoA MeSH ID. A thorough literature search further uncovered association with contractile, cytoskeletal and regulatory proteins related to excitation-contraction coupling and metabolism that may explain the structural and functional changes observed in our experimental model, and ultimately help to unravel the mechanisms responsible for persistent morbidity after treatment for CoA.


Subject(s)
Aorta/metabolism , Aorta/pathology , Aortic Coarctation/genetics , Gene Expression , Animals , Aortic Coarctation/diagnosis , Aortic Coarctation/therapy , Disease Models, Animal , Gene Expression Profiling , Humans , Male , Molecular Sequence Annotation , Rabbits
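
The abstract above selects DEGs with a >6 fold-change but does not say how the fold change was computed. As a rough sketch only, the filter below assumes a signed ratio of group means on positive, linear-scale intensities. The gene names, the values, and both function names are hypothetical.

```python
from statistics import mean


def fold_change(treated, control):
    """Signed fold change between group means.

    Assumes positive, linear-scale intensities. Up-regulation is reported as
    a positive ratio and down-regulation as a negative one (an assumed
    convention, not confirmed by the abstract).
    """
    t, c = mean(treated), mean(control)
    return t / c if t >= c else -(c / t)


def select_degs(expression, threshold=6.0):
    """Keep genes whose absolute fold change exceeds `threshold`."""
    kept = {}
    for gene, (treated, control) in expression.items():
        fc = fold_change(treated, control)
        if abs(fc) > threshold:
            kept[gene] = fc
    return kept


# Invented intensities for three placeholder genes, four samples per group.
expression = {
    "GENE_A": ([70, 74, 66, 70], [10, 10, 10, 10]),  # 7-fold up
    "GENE_B": ([12, 12, 12, 12], [10, 10, 10, 10]),  # 1.2-fold up, filtered out
    "GENE_C": ([1, 1, 1, 1], [8, 8, 8, 8]),          # 8-fold down
}
print(select_degs(expression))
```

A real analysis would normally pair such a threshold with a statistical test, which this sketch leaves out.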