1.
Genes (Basel) ; 15(3), 2024 03 07.
Article in English | MEDLINE | ID: mdl-38540403

ABSTRACT

The false discovery rate (FDR) is a widely used metric of statistical significance for genomic data analyses that involve multiple hypothesis testing. Power and sample size considerations are important in planning studies that perform these types of genomic data analyses. Here, we propose a three-rectangle approximation of a p-value histogram to derive a formula to compute the statistical power and sample size for analyses that involve the FDR. We also introduce the R package FDRsamplesize2, which incorporates these and other power calculation formulas to compute power for a broad variety of studies not covered by other FDR power calculation software. A few illustrative examples are provided. The FDRsamplesize2 package is available on CRAN.
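As background for the kind of analysis whose power is being computed, the Benjamini-Hochberg step-up procedure is one common way to control the FDR across many tests. The sketch below is a minimal illustration of that procedure only; it is not the three-rectangle power formula from the abstract and does not reproduce any function of the FDRsamplesize2 package. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * alpha / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_k = rank

    # Reject the hypotheses with the max_k smallest p-values.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejections = benjamini_hochberg(example_p, alpha=0.05)
```

Power in this setting is the expected fraction of truly non-null hypotheses that end up rejected by such a procedure, which is why it depends on the shape of the p-value distribution.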


Subject(s)
Algorithms , Software , Sample Size , Research Design , Genomics
2.
PLoS One ; 19(3): e0300638, 2024.
Article in English | MEDLINE | ID: mdl-38547174

ABSTRACT

While time-to-event data are often continuous, there are several instances where discrete survival data, which are inherently ordinal, may be available or are more appropriate or useful. Several discrete survival models exist, but the forward continuation ratio model with a complementary log-log link has a survival interpretation and is closely related to the Cox proportional hazards model, despite being an ordinal model. This model has previously been implemented in the high-dimensional setting using the ordinal generalized monotone incremental forward stagewise algorithm. Here, we propose a Bayesian penalized forward continuation ratio model with a complementary log-log link and explore different priors to perform variable selection and regularization. Through simulations, we show that our Bayesian model outperformed the existing frequentist method in terms of variable selection performance, and that a 10% prior inclusion probability performed better than 1% or 50%. We also illustrate our model on a publicly available acute myeloid leukemia dataset to identify genomic features associated with discrete survival. We identified nine features that map to ten unique genes, five of which have been previously associated with leukemia in the literature. In conclusion, our proposed Bayesian model is flexible, allows simultaneous variable selection and uncertainty quantification, and performed well in simulation studies and application to real data.
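The link between the forward continuation ratio model and the Cox model can be made concrete with a small numerical sketch. It assumes the standard form of the complementary log-log link, in which the discrete hazard for interval j is h_j = P(Y = j | Y >= j, x) = 1 - exp(-exp(alpha_j + eta)), with eta the linear predictor. The sign convention, function names, and parameter values are assumptions for illustration, not the authors' implementation, and no Bayesian priors are included.

```python
import math


def cloglog_hazards(alphas, eta):
    """Discrete hazards h_j = 1 - exp(-exp(alpha_j + eta)), one per interval."""
    return [1.0 - math.exp(-math.exp(a + eta)) for a in alphas]


def survival_curve(hazards):
    """Survival after each interval: S_j = product over k <= j of (1 - h_k)."""
    curve, surviving = [], 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve


# Hypothetical interval intercepts and linear predictor, for illustration only.
alphas = [-2.0, -1.5, -1.0]
baseline = survival_curve(cloglog_hazards(alphas, 0.0))
shifted = survival_curve(cloglog_hazards(alphas, 0.7))
```

Under this form, the survival curve for linear predictor eta equals the baseline curve raised to the power exp(eta), which is the same relationship that holds under proportional hazards.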


Subject(s)
Algorithms , Genomics , Bayes Theorem , Proportional Hazards Models , Computer Simulation
3.
Brief Bioinform ; 23(6), 2022 11 19.
Article in English | MEDLINE | ID: mdl-36184192

ABSTRACT

For many high-dimensional genomic and epigenomic datasets, the outcome of interest is ordinal. While these ordinal outcomes are often thought of as the observed cutpoints of some latent continuous variable, some ordinal outcomes are truly discrete and consist of a subjective combination of several factors. The nonlinear stereotype logistic model, which does not assume proportional odds, was developed for these 'assessed' ordinal variables. It has previously been extended to the frequentist high-dimensional feature selection setting, but the Bayesian framework provides some distinct advantages in terms of simultaneous uncertainty quantification and variable selection. Here, we review the stereotype model and Bayesian variable selection methods and demonstrate how to combine them to select genomic features associated with discrete ordinal outcomes. We compared the Bayesian and frequentist methods in terms of variable selection performance. We additionally applied the Bayesian stereotype method to an acute myeloid leukemia RNA-sequencing dataset to further demonstrate its variable selection abilities by identifying features associated with the European LeukemiaNet prognostic risk score.
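For readers unfamiliar with the stereotype model, a minimal sketch of its category probabilities follows. It assumes the usual parameterization, in which the log odds of category j versus the last category are alpha_j + phi_j * eta, with the last category's alpha and phi fixed at 0 and the first phi fixed at 1 for identifiability. The function name and parameter values are hypothetical, and the sketch covers only the likelihood piece, not the Bayesian variable selection described in the abstract.

```python
import math


def stereotype_probs(alphas, phis, eta):
    """Category probabilities under a stereotype logistic model.

    P(Y = j) is proportional to exp(alpha_j + phi_j * eta), where eta is the
    linear predictor shared by all categories and phi_j scales its effect.
    """
    scores = [a + p * eta for a, p in zip(alphas, phis)]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameters for a three-category outcome; the last category is
# the reference (alpha = 0, phi = 0) and phi decreases from 1 to 0.
probs = stereotype_probs(alphas=[0.2, -0.1, 0.0], phis=[1.0, 0.4, 0.0], eta=1.5)
```

Because each category has its own scale phi_j on a single shared linear predictor, the model does not impose proportional odds, yet it needs only one coefficient per feature.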


Subject(s)
Genomics , Logistic Models , Bayes Theorem , Risk Factors
4.
Stats (Basel) ; 5(2): 371-384, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35574500

ABSTRACT

The stage of cancer is a discrete ordinal response that indicates the aggressiveness of disease and is often used by physicians to determine the type and intensity of treatment to be administered. For example, the FIGO stage in cervical cancer is based on the size and depth of the tumor as well as the level of spread. It may be of clinical relevance to identify molecular features from high-throughput genomic assays that are associated with the stage of cervical cancer to elucidate pathways related to tumor aggressiveness, identify improved molecular features that may be useful for staging, and identify therapeutic targets. High-throughput RNA-Seq data and corresponding clinical data (including stage) for cervical cancer patients have been made available through The Cancer Genome Atlas Project (TCGA). We recently described penalized Bayesian ordinal response models that can be used for variable selection for over-parameterized datasets, such as the TCGA-CESC dataset. Herein, we describe our ordinalbayes R package, available from the Comprehensive R Archive Network (CRAN), which enhances the runjags R package by enabling users to easily fit cumulative logit models when the outcome is ordinal and the number of predictors exceeds the sample size, P > N, such as for TCGA and other high-throughput genomic data. We demonstrate the use of this package by applying it to the TCGA cervical cancer dataset. Our ordinalbayes package can be used to fit models to high-dimensional datasets, and it effectively performs variable selection.
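The cumulative logit model mentioned above can be illustrated with a short sketch of how category probabilities arise from ordered thresholds. It assumes the common convention P(Y <= j | x) = 1 / (1 + exp(-(alpha_j - eta))), with increasing thresholds alpha_j and linear predictor eta. The sign convention, function name, and values are assumptions for illustration; this is not the ordinalbayes or runjags interface, and it omits the penalization used for variable selection.

```python
import math


def cumulative_logit_probs(thresholds, eta):
    """Category probabilities for a cumulative logit (proportional odds) model.

    thresholds must be strictly increasing; K thresholds give K + 1 categories.
    """
    # Cumulative probabilities P(Y <= j) for j = 1..K, then P(Y <= K + 1) = 1.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)

    # Each category probability is the difference of adjacent cumulative values.
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical thresholds for a four-level ordinal outcome (for example, stage).
stage_probs = cumulative_logit_probs(thresholds=[-1.0, 0.5, 2.0], eta=0.3)
```

A larger eta shifts probability mass toward higher categories under this convention, and the same coefficients apply at every threshold, which is the proportional odds assumption.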

5.
Clin Epigenetics ; 13(1): 188, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34635168

ABSTRACT

BACKGROUND: Racial/ethnic disparities in health reflect a combination of genetic and environmental causes, and DNA methylation may be an important mediator. In an exploratory analysis, we compared the blood DNA methylome of Japanese Americans (JPA) versus European Americans (EUA). METHODS: Genome-wide buffy coat DNA methylation was profiled among healthy Multiethnic Cohort participant women who were Japanese (JPA; n = 30) or European (EUA; n = 28) Americans aged 60-65. Differentially methylated CpGs by race/ethnicity (DM-CpGs) were identified by linear regression (Bonferroni-corrected P < 0.1) and analyzed in relation to corresponding gene expression, a priori selected single nucleotide polymorphisms (SNPs), and blood biomarkers of inflammation and metabolism using Pearson or Spearman correlations (FDR < 0.1). RESULTS: We identified 174 DM-CpGs, the majority of which were hypermethylated in JPA compared to EUA (n = 133), often in promoter regions (n = 48). Half (51%) of the genes corresponding to the DM-CpGs were involved in liver function and liver disease, and the methylation in nine genes was significantly correlated with gene expression for DM-CpGs. A total of 156 DM-CpGs were associated with rs7489665 (SH2B1). Methylation of DM-CpGs was correlated with blood levels of the cytokine MIP1B (n = 146). We confirmed some of the DM-CpGs in the TCGA adjacent non-tumor liver tissue of Asians versus EUA. CONCLUSION: We found a number of differentially methylated CpGs in blood DNA between JPA and EUA women with a potential link to liver disease, specific SNPs, and systemic inflammation. These findings may support further research on the role of DNA methylation in mediating some of the higher risk of liver disease among JPA.


Subject(s)
Asian People/ethnology , DNA Methylation/genetics , Ethnicity/genetics , White People/ethnology , Adaptor Proteins, Signal Transducing/analysis , Adaptor Proteins, Signal Transducing/blood , Aged , Asian People/statistics & numerical data , Cohort Studies , DNA Methylation/physiology , Ethnicity/statistics & numerical data , Female , Genome-Wide Association Study , Humans , Japan/ethnology , Male , Middle Aged , United States/ethnology , White People/statistics & numerical data