1.
Stat Sin ; 33(2): 729-758, 2023 Apr.
Article in English | MEDLINE | ID: mdl-38037567

ABSTRACT

This study has been motivated by cancer research, in which heterogeneity analysis plays an important role and can be roughly classified as unsupervised or supervised. In supervised heterogeneity analysis, the finite mixture of regression (FMR) technique is used extensively, under which the covariates affect the response differently in subgroups. High-dimensional molecular and, very recently, histopathological imaging features have been analyzed separately and shown to be effective for heterogeneity analysis. In simpler analyses, they have been shown to contain overlapping but also independent information. In this article, our goal is to conduct the first and more effective FMR-based cancer heterogeneity analysis by integrating high-dimensional molecular and histopathological imaging features. A penalization approach is developed to regularize estimation, select relevant variables, and, equally importantly, promote the identification of independent information. Consistency properties are rigorously established. An effective computational algorithm is developed. A simulation and an analysis of The Cancer Genome Atlas (TCGA) lung cancer data demonstrate the practical effectiveness of the proposed approach. Overall, this study provides a practical and useful new way of conducting supervised cancer heterogeneity analysis.

2.
Biometrics ; 78(4): 1579-1591, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34390584

ABSTRACT

In cancer research, supervised heterogeneity analysis has important implications. Such analysis has been traditionally based on clinical/demographic/molecular variables. Recently, histopathological imaging features, which are generated as a byproduct of biopsy, have been shown to be effective for modeling cancer outcomes, and a handful of supervised heterogeneity analyses have been conducted based on such features. There are two types of histopathological imaging features, which are extracted based on specific biological knowledge and using automated imaging processing software, respectively. Using both types of histopathological imaging features, our goal is to conduct the first supervised cancer heterogeneity analysis that satisfies a hierarchical structure. That is, the first type of imaging features defines a rough structure, and the second type defines a nested and more refined structure. A penalization approach is developed, which has been motivated by but differs significantly from penalized fusion and sparse group penalization. It has satisfactory statistical and numerical properties. In the analysis of lung adenocarcinoma data, it identifies a heterogeneity structure significantly different from the alternatives and has satisfactory prediction and stability performance.


Subject(s)
Neoplasms , Humans , Neoplasms/diagnostic imaging , Software
3.
Biometrics ; 77(4): 1397-1408, 2021 Dec.
Article in English | MEDLINE | ID: mdl-32822084

ABSTRACT

Heterogeneity is a hallmark of cancer. For various cancer outcomes/phenotypes, supervised heterogeneity analysis has been conducted, leading to a deeper understanding of disease biology and customized clinical decisions. In the literature, such analysis has often been based on demographic, clinical, and omics measurements. Recent studies have shown that high-dimensional histopathological imaging features contain valuable information on cancer outcomes. However, comparatively, heterogeneity analysis based on imaging features has been very limited. In this article, we conduct supervised cancer heterogeneity analysis using histopathological imaging features. The penalized fusion technique, which has notable advantages, such as greater flexibility, over finite mixture modeling and other techniques, is adopted. A sparse penalization is further imposed to accommodate high dimensionality and select relevant imaging features. To improve computational feasibility and generate more reliable estimation, we employ model averaging. Computational and statistical properties of the proposed approach are carefully investigated. Simulation demonstrates its favorable performance. The analysis of The Cancer Genome Atlas (TCGA) data may provide a new way of defining/examining breast cancer heterogeneity.


Subject(s)
Breast Neoplasms , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/genetics , Computer Simulation , Female , Humans
4.
Sci Rep ; 10(1): 15030, 2020 Sep 14.
Article in English | MEDLINE | ID: mdl-32929170

ABSTRACT

For lung and many other cancers, prognosis is essentially important, and extensive modeling has been carried out. Cancer is a genetic disease. In the past 2 decades, diverse molecular data (such as gene expressions and DNA mutations) have been analyzed in prognosis modeling. More recently, histopathological imaging data, which is a "byproduct" of biopsy, has been suggested as informative for prognosis. In this article, with the TCGA LUAD and LUSC data, we examine and directly compare modeling lung cancer overall survival using gene expressions versus histopathological imaging features. High-dimensional penalization methods are adopted for estimation and variable selection. Our findings include that gene expressions have slightly better prognostic performance, and that most of the gene expressions are weakly correlated with imaging features. This study may provide additional insight into utilizing the two types of important data in cancer prognosis modeling and into lung cancer overall survival.


Subject(s)
Biomarkers, Tumor/genetics , Lung Neoplasms/genetics , Aged , Biomarkers, Tumor/metabolism , Biomarkers, Tumor/standards , Computational Biology/methods , Computational Biology/standards , Cytodiagnosis/methods , Cytodiagnosis/standards , Diagnosis, Computer-Assisted/methods , Diagnosis, Computer-Assisted/standards , Female , Gene Expression Regulation, Neoplastic , Humans , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Male , Middle Aged , Prognosis
5.
BMC Complement Med Ther ; 20(1): 34, 2020 Feb 05.
Article in English | MEDLINE | ID: mdl-32024509

ABSTRACT

BACKGROUND: The current work aimed to assess whether Gynostemma pentaphyllum (GP), a Chinese herbal medicine, structurally modifies the gut microbiota in rats during non-alcoholic fatty liver disease (NAFLD) treatment. METHODS: High-fat diet (HFD)-induced NAFLD rats were orally administered water decoction of GP or equal amounts of distilled water per day for 4 weeks. Liver tissues were examined by histopathological observation, while intestinal tissues were examined by both histopathological and ultrastructural observations. The levels of fasting blood glucose (FBG), fasting serum insulin (FINS), total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), alanine transaminase (ALT) and aspartate transaminase (AST) were measured by enzymatic method. The levels of toll-like receptor 4 (TLR-4), tumor necrosis factor-alpha (TNF-α), interleukin-1-beta (IL-1β) and interleukin-6 (IL-6) in both serum and hepatic tissues were measured by RT-qPCR. The protein expression level of TLR-4 in hepatic tissues was detected by western blot. The gut microbiota was assessed by 16S rRNA-based microbiota analysis. RESULTS: GP maintained intestinal integrity and reversed gut dysbiosis in high-fat diet (HFD)-induced NAFLD rats. GP treatment also reduced the ratio of Firmicutes to Bacteroidetes, enriching the abundance of beneficial bacteria (Lactococcus spp.) and inhibiting the abundance of pathogenic bacteria (Ruminococcus spp.) in the gut. The levels of pro-inflammatory cytokines (TNF-α, IL-1β and IL-6) and the expression of TLR-4 were downregulated (P < 0.05), while the insulin resistance index (HOMA-IR) improved with GP treatment (P < 0.05). Liver function indicators (ALT and AST) were remarkably decreased (P < 0.01). Besides, GP treatment reduced TG and LDL-C levels (P < 0.05), and increased HDL-C level (P < 0.05) compared with the NAFLD group.
CONCLUSION: The structural alterations of gut microbiota induced by GP are associated with NAFLD alleviation.


Subject(s)
Drugs, Chinese Herbal/pharmacology , Gastrointestinal Microbiome/drug effects , Gynostemma , Non-alcoholic Fatty Liver Disease/drug therapy , Plant Extracts/pharmacology , Animals , China , Cytokines/metabolism , Diet, High-Fat , Disease Models, Animal , Intestines/drug effects , Liver/drug effects , Male , RNA, Ribosomal, 16S/metabolism , Rats , Rats, Sprague-Dawley
6.
Cancers (Basel) ; 11(4)2019 Apr 24.
Article in English | MEDLINE | ID: mdl-31022926

ABSTRACT

Histopathological imaging has been routinely conducted in cancer diagnosis and recently used for modeling other cancer outcomes/phenotypes such as prognosis. Clinical/environmental factors have long been extensively used in cancer modeling. However, there is still a lack of study exploring possible interactions of histopathological imaging features and clinical/environmental risk factors in cancer modeling. In this article, we explore such a possibility and conduct both marginal and joint interaction analysis. Novel statistical methods, which are "borrowed" from gene-environment interaction analysis, are employed. Analysis of The Cancer Genome Atlas (TCGA) lung adenocarcinoma (LUAD) data is conducted. More specifically, we examine a biomarker of lung function as well as overall survival. Possible interaction effects are identified. Overall, this study can suggest an alternative way of cancer modeling that innovatively combines histopathological imaging and clinical/environmental data.

7.
Cancers (Basel) ; 11(3)2019 Mar 13.
Article in English | MEDLINE | ID: mdl-30871256

ABSTRACT

Cancer prognosis is of essential interest, and extensive research has been conducted searching for biomarkers with prognostic power. Recent studies have shown that both omics profiles and histopathological imaging features have prognostic power. There are also studies exploring integrating the two types of measurements for prognosis modeling. However, there is a lack of study rigorously examining whether omics measurements have independent prognostic power conditional on histopathological imaging features, and vice versa. In this article, we adopt a rigorous statistical testing framework and test whether an individual gene expression measurement can improve prognosis modeling conditional on high-dimensional imaging features, and a parallel analysis is conducted reversing the roles of gene expressions and imaging features. In the analysis of The Cancer Genome Atlas (TCGA) lung adenocarcinoma and liver hepatocellular carcinoma data, it is found that multiple individual genes, conditional on imaging features, can lead to significant improvement in prognosis modeling; however, individual imaging features, conditional on gene expressions, only offer limited prognostic power. Being among the first to examine the independent prognostic power, this study may assist in better understanding the "connectedness" between omics profiles and histopathological imaging features and provide important insights for data integration in cancer modeling.

8.
Biochim Biophys Acta Mol Basis Dis ; 1864(6 Pt B): 2274-2283, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29241666

ABSTRACT

Circadian genes are expressed periodically with an approximately 24-h period, and the identification and study of these genes can provide deep understanding of the circadian control which plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN), a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Compared with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood.
Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological control. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang.


Subject(s)
Circadian Rhythm/physiology , Databases, Genetic , Gene Expression Profiling/methods , Machine Learning , Transcriptome/physiology , Humans
9.
Biomed Res Int ; 2016: 2714341, 2016.
Article in English | MEDLINE | ID: mdl-27437397

ABSTRACT

Background. With the development of massively parallel sequencing (MPS), noninvasive prenatal diagnosis using maternal cell-free DNA is fast becoming the preferred method of fetal chromosomal abnormality detection, due to its inherent high accuracy and low risk. Typically, MPS data is parsed to calculate a risk score, which is used to predict whether a fetal chromosome is normal or not. Although there are several highly sensitive and specific MPS data-parsing algorithms, there are currently no tools that implement these methods. Results. We developed an R package, detection of autosomal abnormalities for fetus (DASAF), that implements the three most popular trisomy detection methods, namely the standard Z-score (STDZ) method, the GC correction Z-score (GCCZ) method, and the internal reference Z-score (IRZ) method, together with one subchromosome abnormality identification method (SCAZ). Conclusions. With the cost of DNA sequencing declining and with advances in personalized medicine, the demand for noninvasive prenatal testing will undoubtedly increase, which will in turn trigger an increase in the tools available for subsequent analysis. DASAF is a user-friendly tool, implemented in R, that supports identification of whole-chromosome as well as subchromosome abnormalities, based on maternal cell-free DNA sequencing data after genome mapping.


Subject(s)
Chromosome Aberrations/embryology , DNA/analysis , DNA/genetics , Fetus/pathology , High-Throughput Nucleotide Sequencing/methods , Software , Cell-Free System , Databases, Genetic , Female , Humans , Pregnancy , Reference Standards , Time Factors
10.
Genomics ; 101(1): 20-3, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23000193

ABSTRACT

Genome-wide association (GWA) studies are currently one of the most powerful tools in identifying disease-associated genes or variants. In typical GWA studies, single-nucleotide polymorphisms (SNPs) are often used as genetic markers. Therefore, it is critical to estimate the percentage of genetic variations which can be covered by SNPs through linkage disequilibrium (LD). In this study, we use the concept of haplotype blocks to evaluate the coverage of five SNP sets including the HapMap and four commercial arrays, for every exon in the human genome. We show that although some chips can reach coverage similar to that of the HapMap, only about 50% of exons are completely covered by haplotype blocks of HapMap SNPs. We suggest further high-resolution genotyping methods are required, to provide adequate genome-wide power for identifying variants.


Subject(s)
Exons , HapMap Project , Polymorphism, Single Nucleotide , Genome, Human , Genotyping Techniques/standards , Haplotypes , Humans , Linkage Disequilibrium , Quality Control