Search | VHL Regional Portal

A weighted FDR procedure under discrete and heterogeneous null distributions.

Chen, Xiongzhi; Doerge, R W; Sarkar, Sanat K.

Biom J ; 62(6): 1544-1563, 2020 Oct.

Article in English | MEDLINE | ID: mdl-32367597

ABSTRACT

Multiple testing (MT) with false discovery rate (FDR) control has been widely conducted in the "discrete paradigm" where p-values have discrete and heterogeneous null distributions. However, in this scenario existing FDR procedures often lose some power and may yield unreliable inference, and for this scenario there does not seem to be an FDR procedure that partitions hypotheses into groups, employs data-adaptive weights and is nonasymptotically conservative. We propose a weighted p-value-based FDR procedure, "weighted FDR (wFDR) procedure" for short, for MT in the discrete paradigm that efficiently adapts to both heterogeneity and discreteness of p-value distributions. We theoretically justify the nonasymptotic conservativeness of the wFDR procedure under independence, and show via simulation studies that, for MT based on p-values of binomial test or Fisher's exact test, it is more powerful than six other procedures. The wFDR procedure is applied to two examples based on discrete data, a drug safety study, and a differential methylation study, where it makes more discoveries than two existing methods.
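To make the "discrete paradigm" concrete, the following is a minimal sketch, not the wFDR procedure itself, of how p-values from a one-sided exact binomial test behave under the null. The function names, the null success probability `p0 = 0.5`, and the sample sizes are illustrative assumptions and do not come from the article. The sketch shows that the null probability of a p-value falling at or below a threshold `t` can be well below `t` (discreteness) and that it changes with the number of trials (heterogeneity).

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail_pvalues(n, p0=0.5):
    """Return (pmf, pvals) for x = 0..n, where pvals[x] = P(X >= x | p0)."""
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
    pvals = [0.0] * (n + 1)
    tail = 0.0
    # Accumulate the upper tail from x = n down to x = 0.
    for k in range(n, -1, -1):
        tail += pmf[k]
        pvals[k] = tail
    return pmf, pvals


def null_cdf_of_pvalue(n, t, p0=0.5):
    """P(p-value <= t) under the null; a uniform p-value would give exactly t."""
    pmf, pvals = upper_tail_pvalues(n, p0)
    return sum(mass for mass, pv in zip(pmf, pvals) if pv <= t)


if __name__ == "__main__":
    for n in (5, 10):
        print(n, null_cdf_of_pvalue(n, 0.05))
```

With 5 trials, the only attainable p-value at or below 0.05 is 1/32, so the test rejects with null probability 1/32 rather than 0.05; with 10 trials the corresponding probability is 11/1024. Procedures that assume uniform null p-values ignore this gap, which is one way power can be lost in the discrete setting.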

Subject(s)

Biometry , Models, Statistical , Computer Simulation , Methylation , Pharmaceutical Preparations

Controlling false discoveries in multidimensional directional decisions, with applications to gene expression data on ordered categories.

Guo, Wenge; Sarkar, Sanat K; Peddada, Shyamal D.

Biometrics ; 66(2): 485-92, 2010 Jun.

Article in English | MEDLINE | ID: mdl-19645703

ABSTRACT

Microarray gene expression studies over ordered categories are routinely conducted to gain insights into biological functions of genes and the underlying biological processes. Some common experiments are time-course/dose-response experiments where a tissue or cell line is exposed to different doses and/or durations of time to a chemical. A goal of such studies is to identify gene expression patterns/profiles over the ordered categories. This problem can be formulated as a multiple testing problem where for each gene the null hypothesis of no difference between the successive mean gene expressions is tested and further directional decisions are made if it is rejected. Most of the existing multiple testing procedures are devised for controlling the usual false discovery rate (FDR) rather than the mixed directional FDR (mdFDR), the expected proportion of Type I and directional errors among all rejections. Benjamini and Yekutieli (2005, Journal of the American Statistical Association 100, 71-93) proved that an augmentation of the usual Benjamini-Hochberg (BH) procedure can control the mdFDR while testing simple null hypotheses against two-sided alternatives in terms of one-dimensional parameters. In this article, we consider the problem of controlling the mdFDR involving multidimensional parameters. To deal with this problem, we develop a procedure extending that of Benjamini and Yekutieli based on the Bonferroni test for each gene. A proof is given for its mdFDR control when the underlying test statistics are independent across the genes. The results of a simulation study evaluating its performance under independence as well as under dependence of the underlying test statistics across the genes relative to other relevant procedures are reported. Finally, the proposed methodology is applied to time-course microarray data obtained by Lobenhofer et al. (2002, Molecular Endocrinology 16, 1215-1229). We identified several important cell-cycle genes, such as DNA replication/repair gene MCM4 and replication factor subunit C2, which were not identified by the previous analyses of the same data by Lobenhofer et al. (2002) and Peddada et al. (2003, Bioinformatics 19, 834-841). Although some of our findings overlap with previous findings, we identify several other genes that complement the results of Lobenhofer et al. (2002).
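For readers unfamiliar with the Benjamini-Hochberg (BH) procedure that the abstract builds on, here is a minimal sketch of the standard BH step-up rule only. It does not implement the directional augmentation of Benjamini and Yekutieli or the multidimensional mdFDR procedure proposed in the article; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Standard BH step-up rule.

    Returns a list of booleans, in the original order of `pvalues`,
    marking which hypotheses are rejected at nominal FDR level `alpha`.
    """
    m = len(pvalues)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k with p_(k) <= k * alpha / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

Because the rule is step-up, a hypothesis whose p-value exceeds its own threshold can still be rejected when a larger p-value further down the ordering meets its threshold; for example, the p-values 0.04, 0.045, 0.03 and 0.049 are all rejected at level 0.05 because the largest one satisfies 0.049 <= 4 * 0.05 / 4.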

Subject(s)

Artifacts , Artificial Intelligence , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Computational Biology , Gene Expression Profiling/methods , Humans , Kinetics , Methods , Oligonucleotide Array Sequence Analysis/methods , Sensitivity and Specificity , Time Factors

A Bayesian determination of threshold for identifying differentially expressed genes in microarray experiments.

Chen, Jie; Sarkar, Sanat K.

Stat Med ; 25(18): 3174-89, 2006 Sep 30.

Article in English | MEDLINE | ID: mdl-16345048

ABSTRACT

The original definitions of false discovery rate (FDR) and false non-discovery rate (FNR) can be understood as the frequentist risks of false rejections and false non-rejections, respectively, conditional on the unknown parameter, while the Bayesian posterior FDR and posterior FNR are conditioned on the data. From a Bayesian point of view, it seems natural to take into account the uncertainties in both the parameter and the data. In this spirit, we propose averaging out the frequentist risks of false rejections and false non-rejections with respect to some prior distribution of the parameters to obtain the average FDR (AFDR) and average FNR (AFNR), respectively. A linear combination of the AFDR and AFNR, called the average Bayes error rate (ABER), is considered as an overall risk. Some useful formulas for the AFDR, AFNR and ABER are developed for normal samples with hierarchical mixture priors. The idea of finding threshold values by minimizing the ABER or controlling the AFDR is illustrated using a gene expression data set. Simulation studies show that the proposed approaches are more powerful and robust than the widely used FDR method.

Subject(s)

Bayes Theorem , Oligonucleotide Array Sequence Analysis/methods , BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/genetics , Computer Simulation , Female , Gene Expression , Humans , Mutation
