Results 1 - 20 of 49
1.
Emerg Infect Dis ; 29(9): 1789-1797, 2023 09.
Article in English | MEDLINE | ID: mdl-37610167

ABSTRACT

Brucellosis is a major public health concern worldwide, especially for persons living in resource-limited settings. Historically, an evidence-based estimate of the global annual incidence of human cases has been elusive. We used international public health data to fill this information gap through application of risk metrics to worldwide and regional at-risk populations. We performed estimations using 3 statistical models (weighted average interpolation, bootstrap resampling, and Bayesian inference) and considered missing information. An evidence-based conservative estimate of the annual global incidence is 2.1 million, significantly higher than was previously assumed. Our models indicate Africa and Asia sustain most of the global risk and cases, although areas within the Americas and Europe remain of concern. This study reveals that disease risk and incidence are higher than previously suggested and lie mainly within resource-limited settings. Clarification of both misdiagnosis and underdiagnosis is required because those factors will amplify case estimates.
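The abstract names three estimation strategies. As a concrete illustration of one of them, here is a minimal bootstrap-resampling sketch; the regional rates and at-risk populations below are entirely made-up placeholders, not data from the study, and the study's actual models are more involved.

```python
# Toy bootstrap sketch (not the authors' model): resample regional
# incidence rates and combine them with at-risk population sizes to
# get an interval estimate of global annual cases.
import numpy as np

rng = np.random.default_rng(1)
rates = np.array([25.0, 40.0, 8.0, 3.0])   # hypothetical cases per 100,000 at risk
pops = np.array([9e8, 2.5e9, 5e8, 3e8])    # hypothetical at-risk populations

boot = []
for _ in range(10_000):
    # perturb each regional rate to mimic sampling variability
    r = rng.normal(rates, 0.2 * rates)
    boot.append(np.sum(np.clip(r, 0, None) / 1e5 * pops))

lo, mid, hi = np.percentile(boot, [2.5, 50, 97.5])
print(f"global annual cases: {mid:,.0f} (95% interval {lo:,.0f}-{hi:,.0f})")
```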


Subject(s)
Brucellosis , Humans , Bayes Theorem , Incidence , Africa , Asia , Brucellosis/epidemiology
2.
Proc Natl Acad Sci U S A ; 120(8): e2217331120, 2023 02 21.
Article in English | MEDLINE | ID: mdl-36780516

ABSTRACT

Bayes factors represent a useful alternative to P-values for reporting outcomes of hypothesis tests by providing direct measures of the relative support that data provide to competing hypotheses. Unfortunately, the competing hypotheses have to be specified, and the calculation of Bayes factors in high-dimensional settings can be difficult. To address these problems, we define Bayes factor functions (BFFs) directly from common test statistics. BFFs depend on a single noncentrality parameter that can be expressed as a function of standardized effects, and plots of BFFs versus effect size provide informative summaries of hypothesis tests that can be easily aggregated across studies. Such summaries eliminate the need for arbitrary P-value thresholds to define "statistical significance." Because BFFs are defined using nonlocal alternative prior densities, they provide more rapid accumulation of evidence in favor of true null hypotheses without sacrificing efficiency in supporting true alternative hypotheses. BFFs can be expressed in closed form and can be computed easily from z, t, χ², and F statistics.
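In the spirit of a BFF, the sketch below computes a Bayes factor for a z statistic under a normal moment (nonlocal) alternative prior and traces it as a function of the prior scale τ. This is our illustration of the idea using standard moment-prior algebra, not the article's exact BFF definitions.

```python
# Bayes factor for z | theta ~ N(theta, 1), testing H0: theta = 0
# against H1: theta ~ (theta^2 / tau^2) * N(0, tau^2), a moment prior.
import numpy as np
from scipy.stats import norm

def bf10_moment_z(z, tau):
    s2 = 1.0 + tau**2
    v = tau**2 / s2               # posterior variance of theta under H1
    m = z * tau**2 / s2           # posterior mean of theta under H1
    # marginal under H1 is N(0, 1 + tau^2) times E[theta^2]/tau^2
    return norm.pdf(z, 0, np.sqrt(s2)) / norm.pdf(z) * (v + m**2) / tau**2

z = 2.5
for tau in (0.2, 0.5, 1.0, 2.0):
    print(f"tau={tau:.1f}  BF10={bf10_moment_z(z, tau):.2f}")
```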


Subject(s)
Research Design , Bayes Theorem
3.
Psychol Methods ; 2022 Apr 14.
Article in English | MEDLINE | ID: mdl-35420854

ABSTRACT

Bayesian hypothesis testing procedures have gained increased acceptance in recent years. A key advantage that Bayesian tests have over classical testing procedures is their potential to quantify information in support of true null hypotheses. Ironically, default implementations of Bayesian tests prevent the accumulation of strong evidence in favor of true null hypotheses because associated default alternative hypotheses assign a high probability to data that are most consistent with a null effect. We propose the use of "nonlocal" alternative hypotheses to resolve this paradox. The resulting class of Bayesian hypothesis tests permits more rapid accumulation of evidence in favor of both true null hypotheses and alternative hypotheses that are compatible with standardized effect sizes of most interest in psychology.
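For reference, the simplest nonlocal alternative is the first-order moment (MOM) prior on a standardized effect θ (our notation). Its density vanishes at the null value θ = 0, which is what lets evidence for a true null accumulate quickly:

```latex
\[
  \pi_{\mathrm{MOM}}(\theta \mid \tau)
    = \frac{\theta^{2}}{\tau^{2}} \cdot
      \frac{1}{\sqrt{2\pi}\,\tau}
      \exp\!\left(-\frac{\theta^{2}}{2\tau^{2}}\right),
  \qquad \pi_{\mathrm{MOM}}(0 \mid \tau) = 0 .
\]
```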

4.
IEEE Access ; 10: 116844-116857, 2022.
Article in English | MEDLINE | ID: mdl-37275750

ABSTRACT

Clustering is a challenging problem in machine learning in which one attempts to group N objects into K₀ groups based on P features measured on each object. In this article, we examine the case where N ≪ P and K₀ is not known. Clustering in such high-dimensional, small-sample-size settings has numerous applications in biology, medicine, the social sciences, clinical trials, and other scientific and experimental fields. Whereas most existing clustering algorithms either require the number of clusters to be known a priori or are sensitive to the choice of tuning parameters, our method does not require the prior specification of K₀ or any tuning parameters. This represents an important advantage for our method because training data are not available in the applications we consider (i.e., in unsupervised learning problems). Without training data, estimating K₀ and other hyperparameters (and thus applying alternative clustering algorithms) can be difficult and lead to inaccurate results. Our method is based on a simple transformation of the Gram matrix and application of the strong law of large numbers to the transformed matrix. If the correlation between features decays as the number of features grows, we show that the transformed feature vectors concentrate tightly around their respective cluster expectations in a low-dimensional space. This result simplifies the detection and visualization of the unknown cluster configuration. We illustrate the algorithm by applying it to 32 benchmarked microarray datasets, each containing thousands of genomic features measured on a relatively small number of tissue samples. Compared to 21 other commonly used clustering methods, we find that the proposed algorithm is faster and twice as accurate in determining the "best" cluster configuration.
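A toy illustration of the underlying phenomenon (not the paper's exact algorithm): when N ≪ P and feature correlations decay, the scaled Gram matrix G = XXᵀ/P concentrates around inner products of cluster means, so an unknown cluster structure shows up as blocks without knowing K₀ or any tuning parameters.

```python
# Simulate N << P objects from K clusters and inspect the Gram matrix.
import numpy as np

rng = np.random.default_rng(0)
N, P, K = 30, 5000, 3
means = rng.normal(0, 1, (K, P))                 # hypothetical cluster centers
labels = rng.integers(0, K, N)
X = means[labels] + rng.normal(0, 3, (N, P))     # noisy objects, N << P

G = X @ X.T / P
np.fill_diagonal(G, 0.0)                         # diagonal is inflated by noise variance
for k in range(K):
    row = [G[np.ix_(labels == k, labels == l)].mean() for l in range(K)]
    print("block means:", np.round(row, 2))      # ~1 within a cluster, ~0 across clusters
```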

5.
Bayesian Anal ; 16(1): 93-109, 2021 Mar.
Article in English | MEDLINE | ID: mdl-34113418

ABSTRACT

Uniformly most powerful Bayesian tests (UMPBTs) are an objective class of Bayesian hypothesis tests that can be considered the Bayesian counterpart of classical uniformly most powerful tests. Because the rejection regions of UMPBTs can be matched to the rejection regions of classical uniformly most powerful tests (UMPTs), UMPBTs provide a mechanism for calibrating Bayesian evidence thresholds, Bayes factors, classical significance levels and p-values. The purpose of this article is to expand the application of UMPBTs outside the class of exponential family models. Specifically, we introduce sufficient conditions for the existence of UMPBTs and propose a unified approach for their derivation. An important application of our methodology is the extension of UMPBTs to testing whether the non-centrality parameter of a chi-squared distribution is zero. The resulting tests have broad applicability, providing default alternative hypotheses to compute Bayes factors in, for example, Pearson's chi-squared test for goodness-of-fit, tests of independence in contingency tables, and likelihood ratio, score and Wald tests.
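To illustrate the calibration idea in the simplest case: for a one-sided normal-mean test, the UMPBT evidence threshold γ matching a classical size-α test is often stated as γ = exp(z²_α/2). That formula is quoted here from memory of the z-test calibration in this literature, so verify it against the articles before relying on it.

```python
# Sketch of the z-test calibration between significance level alpha
# and a matching Bayes factor evidence threshold gamma (assumed form).
import numpy as np
from scipy.stats import norm

for alpha in (0.05, 0.005):
    z_a = norm.ppf(1 - alpha)
    gamma = np.exp(z_a**2 / 2)
    print(f"alpha={alpha}: z={z_a:.3f}, matching evidence threshold ~{gamma:.1f}")
```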

6.
Sci Rep ; 11(1): 3258, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33547395

ABSTRACT

Checkpoint blockade-mediated immunotherapy is emerging as an effective treatment modality for multiple cancer types. However, cancer cells frequently evade the immune system, compromising the effectiveness of immunotherapy. It is crucial to develop screening methods to identify the patients who would most benefit from these therapies because of the risk of side effects and the high cost of treatment. Here we show that expression of the MHC class I transactivator (CITA), NLRC5, is important for efficient responses to anti-CTLA-4 and anti-PD1 checkpoint blockade therapies. Melanoma tumors derived from patients responding to immunotherapy exhibited significantly higher expression of NLRC5 and MHC class I-related genes compared to those from non-responding patients. In addition, multivariate analysis that included the number of tumor-associated non-synonymous mutations, predicted neo-antigen load, and PD-L2 expression was capable of further stratifying responders and non-responders to anti-CTLA-4 therapy. Moreover, expression or methylation of NLRC5 together with total somatic mutation number was significantly correlated with increased patient survival. These results suggest that NLRC5 tumor expression, alone or together with tumor mutation load, constitutes a valuable predictive biomarker for both prognosis and response to anti-CTLA-4 and potentially anti-PD1 blockade immunotherapy in melanoma patients.


Subject(s)
Gene Expression Regulation, Neoplastic/drug effects , Immune Checkpoint Inhibitors/therapeutic use , Intracellular Signaling Peptides and Proteins/genetics , Melanoma/drug therapy , Humans , Immunotherapy , Melanoma/diagnosis , Melanoma/genetics , Mutation/drug effects , Prognosis
7.
J Math Psychol ; 101, 2021 Apr.
Article in English | MEDLINE | ID: mdl-35496657

ABSTRACT

We describe a modified sequential probability ratio test that can be used to reduce the average sample size required to perform statistical hypothesis tests at specified levels of significance and power. Examples are provided for z tests, t tests, and tests of binomial success probabilities. A description of a software package to implement the test designs is provided. We compare the sample sizes required in fixed-design tests conducted at 5% significance levels to the average sample sizes required in sequential tests conducted at 0.5% significance levels, and we find that the two sample sizes are approximately equal.
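For context, here is a minimal classic Wald SPRT for a binomial success probability. The article's modified test adds features this sketch omits (notably a maximum sample size and adjusted thresholds); the boundaries below are the standard Wald approximations.

```python
# Classic SPRT: accumulate the log-likelihood ratio and stop at the
# Wald boundaries A = (1 - beta)/alpha, B = beta/(1 - alpha).
import numpy as np

def sprt_binomial(data, p0=0.5, p1=0.7, alpha=0.005, beta=0.2):
    upper, lower = np.log((1 - beta) / alpha), np.log(beta / (1 - alpha))
    llr = 0.0
    for n, x in enumerate(data, start=1):
        llr += x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "no decision", len(data)

rng = np.random.default_rng(2)
print(sprt_binomial(rng.random(500) < 0.7))   # data generated under H1
```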

8.
Genome Res ; 30(8): 1170-1180, 2020 08.
Article in English | MEDLINE | ID: mdl-32817165

ABSTRACT

De novo mutations (DNMs) are increasingly recognized as causal factors in rare disease. Identifying DNM carriers will allow researchers to study the likely distinct molecular mechanisms of DNMs. We developed Famdenovo to predict DNM status (DNM or familial mutation [FM]) of deleterious autosomal dominant germline mutations for any syndrome. We introduce Famdenovo.TP53 for Li-Fraumeni syndrome (LFS) and analyze 324 LFS family pedigrees from four US cohorts: a validation set of 186 pedigrees and a discovery set of 138 pedigrees. The concordance index for Famdenovo.TP53 prediction was 0.95 (95% CI: [0.92, 0.98]). Forty individuals (95% CI: [30, 50]) were predicted as DNM carriers, increasing the total number from 42 to 82. We compared clinical and biological features of FM versus DNM carriers: (1) cancer and mutation spectra along with parental ages were similarly distributed; (2) ascertainment criteria such as early-onset breast cancer (age 20-35 yr) provide a condition for an unbiased estimate of the DNM rate: 48% (23 DNMs vs. 25 FMs); and (3) hotspot mutation R248W was not observed in DNMs, although it was as prevalent as hotspot mutation R248Q in FMs. Furthermore, we introduce Famdenovo.BRCA for hereditary breast and ovarian cancer syndrome and apply it to a small set of family data from the Cancer Genetics Network. In summary, we introduce a novel statistical approach to systematically evaluate deleterious DNMs in inherited cancer syndromes. Our approach may serve as a foundation for future studies evaluating how new deleterious mutations can be established in the germline, such as those in TP53.


Subject(s)
Breast Neoplasms/genetics , Genetic Predisposition to Disease/genetics , Germ-Line Mutation/genetics , Li-Fraumeni Syndrome/genetics , Ovarian Neoplasms/genetics , Adult , BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/diagnosis , Family , Female , Humans , Pedigree , Tumor Suppressor Protein p53/genetics , Young Adult
9.
J Am Stat Assoc ; 115(532): 1784-1797, 2020.
Article in English | MEDLINE | ID: mdl-33716358

ABSTRACT

We introduce a new shrinkage prior on function spaces, called the functional horseshoe prior (fHS), that encourages shrinkage towards parametric classes of functions. Unlike other shrinkage priors for parametric models, the fHS shrinkage acts on the shape of the function rather than inducing sparsity on model parameters. We study the efficacy of the proposed approach by showing an adaptive posterior concentration property on the function. We also demonstrate consistency of the model selection procedure that thresholds the shrinkage parameter of the functional horseshoe prior. We apply the fHS prior to nonparametric additive models and compare its performance with procedures based on the standard horseshoe prior and several penalized likelihood approaches. We find that the new procedure achieves smaller estimation error and more accurate model selection than other procedures in several simulated and real examples. The supplementary material for this article, which contains additional simulated and real data examples, MCMC diagnostics, and proofs of the theoretical results, is available online.

10.
Ann Appl Stat ; 14(2): 809-828, 2020 Jun.
Article in English | MEDLINE | ID: mdl-33456641

ABSTRACT

Efficient variable selection in high dimensional cancer genomic studies is critical for discovering genes associated with specific cancer types and for predicting response to treatment. Censored survival data is prevalent in such studies. In this article we introduce a Bayesian variable selection procedure that uses a mixture prior composed of a point mass at zero and an inverse moment prior in conjunction with the partial likelihood defined by the Cox proportional hazard model. The procedure is implemented in the R package BVSNLP, which supports parallel computing and uses a stochastic search method to explore the model space. Bayesian model averaging is used for prediction. The proposed algorithm provides better performance than other variable selection procedures in simulation studies, and appears to provide more consistent variable selection when applied to actual genomic datasets.
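For reference, the inverse moment (iMOM) component of the mixture prior is commonly written as follows, stated here from memory of the Johnson-Rossell formulation (verify the exact parameterization against the article). Like other nonlocal priors it is exactly zero at β = 0, which drives the sharper separation between null and nonnull coefficients:

```latex
\[
  \pi_{\mathrm{iMOM}}(\beta \mid \tau, \nu)
    = \frac{\tau^{\nu/2}}{\Gamma(\nu/2)}\,
      |\beta|^{-(\nu+1)}
      \exp\!\left(-\frac{\tau}{\beta^{2}}\right),
  \qquad \pi_{\mathrm{iMOM}}(0 \mid \tau, \nu) = 0 .
\]
```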

11.
Am Stat ; 73(Suppl 1): 129-134, 2019.
Article in English | MEDLINE | ID: mdl-31123367

ABSTRACT

This article examines the evidence contained in t statistics that are marginally significant in 5% tests. The bases for evaluating evidence are likelihood ratios and integrated likelihood ratios, computed under a variety of assumptions regarding the alternative hypotheses in null hypothesis significance tests. Likelihood ratios and integrated likelihood ratios provide a useful measure of the evidence in favor of competing hypotheses because they can be interpreted as representing the ratio of the probabilities that each hypothesis assigns to observed data. When they are either very large or very small, they suggest that one hypothesis is much better than the other in predicting observed data. If they are close to 1.0, then both hypotheses provide approximately equally valid explanations for observed data. I find that p-values that are close to 0.05 (i.e., that are "marginally significant") correspond to integrated likelihood ratios that are bounded by approximately 7 in two-sided tests, and by approximately 4 in one-sided tests. The modest magnitude of integrated likelihood ratios corresponding to p-values close to 0.05 clearly suggests that higher standards of evidence are needed to support claims of novel discoveries and new effects.
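A quick worked check of the cited bounds: for a z statistic, the alternative density is maximized at θ = z, so the maximum likelihood ratio is exp(z²/2), which bounds any integrated likelihood ratio. At p = 0.05 this gives roughly 6.8 (two-sided, z = 1.96) and 3.9 (one-sided, z = 1.645), matching the "approximately 7" and "approximately 4" figures.

```python
# Maximum likelihood ratio exp(z^2 / 2) at marginally significant z values.
import numpy as np
from scipy.stats import norm

for label, z in [("two-sided", norm.ppf(0.975)), ("one-sided", norm.ppf(0.95))]:
    print(f"{label}: z = {z:.3f}, max LR = {np.exp(z**2 / 2):.2f}")
```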

12.
Nature ; 567(7749): 461, 2019 03.
Article in English | MEDLINE | ID: mdl-30903097
13.
Bioinformatics ; 35(1): 1-11, 2019 01 01.
Article in English | MEDLINE | ID: mdl-29931045

ABSTRACT

Motivation: Multiple marker analysis of genome-wide association study (GWAS) data has gained ample attention in recent years. However, because of the ultrahigh dimensionality of GWAS data, such analysis is challenging. Frequently used penalized regression methods often lead to a large number of false positives, whereas Bayesian methods are computationally very expensive. Motivated to ameliorate these issues simultaneously, we consider the novel approach of using non-local priors in an iterative variable selection framework. Results: We develop a variable selection method, named iterative non-local prior-based selection for GWAS (GWASinlps), that combines, in an iterative variable selection framework, the computational efficiency of the screen-and-select approach based on association learning and the parsimonious uncertainty quantification provided by the use of non-local priors. The hallmark of our method is the introduction of a 'structured screen-and-select' strategy that performs hierarchical screening based not only on response-predictor associations but also on response-response associations, and concatenates variable selection within that hierarchy. Extensive simulation studies with single nucleotide polymorphisms having realistic linkage disequilibrium structures demonstrate the advantages of our computationally efficient method compared to several frequentist and Bayesian variable selection methods, in terms of true positive rate, false discovery rate, mean squared error and effect size estimation error. Further, we provide an empirical power analysis useful for study design. Finally, a real GWAS data application was considered with human height as phenotype. Availability and implementation: An R package for implementing the GWASinlps method is available at https://cran.r-project.org/web/packages/GWASinlps/index.html. Supplementary information: Supplementary data are available at Bioinformatics online.
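A schematic of a generic iterative screen-and-select loop, to convey the shape of the idea. This is our simplification: GWASinlps's actual hierarchical screening and nonlocal-prior selection steps are richer than the marginal-correlation screen and greedy pick used here.

```python
# Generic screen-and-select: screen predictors by marginal association
# with the current residual, select one, refit, and repeat.
import numpy as np

def screen_and_select(X, y, n_screen=50, n_iter=5):
    selected, resid = [], y - y.mean()
    for _ in range(n_iter):
        score = np.abs(X.T @ resid)              # marginal association scores
        cand = [j for j in np.argsort(score)[::-1][:n_screen] if j not in selected]
        if not cand:
            break
        selected.append(cand[0])                 # stand-in for the nonlocal-prior selection step
        beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        resid = y - X[:, selected] @ beta
    return selected

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2000))                 # hypothetical SNP-like design matrix
y = X[:, 7] - 0.8 * X[:, 42] + rng.normal(size=200)
print(screen_and_select(X, y))                   # columns 7 and 42 should appear early
```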


Subject(s)
Genome-Wide Association Study , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Software , Bayes Theorem , Computational Biology , Humans , Regression Analysis
14.
Stat Sin ; 28(2): 1053-1078, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29643721

ABSTRACT

Bayesian model selection procedures based on nonlocal alternative prior densities are extended to ultrahigh-dimensional settings and compared to other variable selection procedures using precision-recall curves. The comparisons include methods based on g-priors, the reciprocal lasso, the adaptive lasso, SCAD, and minimax concave penalty criteria. The use of precision-recall curves eliminates the sensitivity of our conclusions to the choice of tuning parameters. We find that Bayesian variable selection procedures based on nonlocal priors are competitive with all other procedures in a range of simulation scenarios, and we subsequently explain this favorable performance through a theoretical examination of their consistency properties. When certain regularity conditions apply, we demonstrate that the nonlocal procedures are consistent for linear models even when the number of covariates p increases sub-exponentially with the sample size n. A model selection procedure based on Zellner's g-prior is also found to be competitive with penalized likelihood methods in identifying the true model, but the posterior distribution on the model space induced by this method is much more dispersed than the posterior distribution induced on the model space by the nonlocal prior methods. We investigate the asymptotic form of the marginal likelihood based on the nonlocal priors and show that it attains a unique term that cannot be derived from the other Bayesian model selection procedures. We also propose a scalable and efficient algorithm called Simplified Shotgun Stochastic Search with Screening (S5) to explore the enormous model space, and we show that S5 dramatically reduces the computing time without losing the capacity to search the interesting region in the model space, at least in the simulation settings considered. The S5 algorithm is available in the R package BayesS5 on CRAN.

15.
J Am Stat Assoc ; 112(517): 1-10, 2017.
Article in English | MEDLINE | ID: mdl-29861517

ABSTRACT

Investigators from a large consortium of scientists recently performed a multi-year study in which they replicated 100 psychology experiments. Although statistically significant results were reported in 97% of the original studies, statistical significance was achieved in only 36% of the replicated studies. This article presents a reanalysis of these data based on a formal statistical model that accounts for publication bias by treating outcomes from unpublished studies as missing data, while simultaneously estimating the distribution of effect sizes for those studies that tested nonnull effects. The resulting model suggests that more than 90% of tests performed in eligible psychology experiments tested negligible effects, and that publication biases based on p-values caused the observed rates of nonreproducibility. The results of this reanalysis provide a compelling argument for both increasing the threshold required for declaring scientific discoveries and for adopting statistical summaries of evidence that account for the high proportion of tested hypotheses that are false. Supplementary materials for this article are available online.
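A toy simulation in the spirit of this reanalysis, with made-up parameters: if roughly 90% of tested effects are null and only significant results are published, replication rates collapse to near the observed level even without any questionable research practices. This is illustrative only, not the article's formal missing-data model.

```python
# Publish only significant originals, then attempt replication.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n_studies, prop_null, n_per_arm = 100_000, 0.9, 30
true_effect = np.where(rng.random(n_studies) < prop_null, 0.0, 0.5)

se = np.sqrt(2 / n_per_arm)                      # SE of a two-sample mean difference
z_orig = rng.normal(true_effect / se, 1)
published = z_orig > norm.ppf(0.975)             # publication filter (one direction, for simplicity)

z_rep = rng.normal(true_effect[published] / se, 1)
print(f"published studies replicating at p < .05: "
      f"{np.mean(z_rep > norm.ppf(0.975)):.0%}")
```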

16.
Bioinformatics ; 32(9): 1338-45, 2016 05 01.
Article in English | MEDLINE | ID: mdl-26740524

ABSTRACT

MOTIVATION: The advent of new genomic technologies has resulted in the production of massive data sets. Analyses of these data require new statistical and computational methods. In this article, we propose one such method that is useful in selecting explanatory variables for prediction of a binary response. Although this problem has recently been addressed using penalized likelihood methods, we adopt a Bayesian approach that utilizes a mixture of non-local prior densities and point masses on the binary regression coefficient vectors. RESULTS: The resulting method, which we call iMOMLogit, provides improved performance in identifying true models and reducing estimation and prediction error in a number of simulation studies. More importantly, its application to several genomic datasets produces predictions that have high accuracy using far fewer explanatory variables than competing methods. We also describe a novel approach for setting prior hyperparameters by examining the total variation distance between the prior distributions on the regression parameters and the distribution of the maximum likelihood estimator under the null distribution. Finally, we describe a computational algorithm that can be used to implement iMOMLogit in ultrahigh-dimensional settings ([Formula: see text]) and provide diagnostics to assess the probability that this algorithm has identified the highest posterior probability model. AVAILABILITY AND IMPLEMENTATION: Software to implement this method can be downloaded at: http://www.stat.tamu.edu/~amir/code.html. CONTACT: wwang7@mdanderson.org or vjohnson@stat.tamu.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Algorithms , Animals , Bayes Theorem , Humans , Likelihood Functions
17.
Biostatistics ; 17(2): 249-63, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26486139

ABSTRACT

We propose a Bayesian phase I/II dose-finding trial design that simultaneously accounts for toxicity and efficacy. We model the toxicity and efficacy of investigational doses using a flexible Bayesian dynamic model, which borrows information across doses without imposing stringent parametric assumptions on the shape of the dose-toxicity and dose-efficacy curves. An intuitive utility function that reflects the desirability trade-offs between efficacy and toxicity is used to guide the dose assignment and selection. We also discuss the extension of this design to handle delayed toxicity and efficacy. We conduct extensive simulation studies to examine the operating characteristics of the proposed method under various practical scenarios. The results show that the proposed design possesses good operating characteristics and is robust to the shape of the dose-toxicity and dose-efficacy curves.
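To make the utility idea concrete, here is a deliberately simplified sketch (ours, not the authors' model): score each dose by a posterior-mean trade-off between efficacy and toxicity, using independent Beta posteriors and a hypothetical toxicity penalty.

```python
# Utility-guided dose scoring with Beta(1, 1) priors per endpoint.
# Counts and the penalty weight are hypothetical placeholders.
doses = [(2, 0, 10), (5, 1, 10), (7, 3, 10), (8, 6, 10)]  # (responses, toxicities, n)
w_tox = 1.5                                               # hypothetical toxicity penalty

for d, (eff, tox, n) in enumerate(doses, 1):
    p_eff = (eff + 1) / (n + 2)                           # posterior mean efficacy
    p_tox = (tox + 1) / (n + 2)                           # posterior mean toxicity
    print(f"dose {d}: utility = {p_eff - w_tox * p_tox:+.2f}")
```

A real design would also restrict attention to doses meeting safety and futility constraints before maximizing utility.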


Subject(s)
Bayes Theorem , Clinical Trials, Phase I as Topic , Clinical Trials, Phase II as Topic , Dose-Response Relationship, Drug , Research Design , Humans
19.
AJR Am J Roentgenol ; 202(4): 703-10, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24660695

ABSTRACT

OBJECTIVE: The purpose of this study was to develop a method of measuring rectal radiation dose in vivo during CT colonography (CTC) and assess the accuracy of size-specific dose estimates (SSDEs) relative to that of in vivo dose measurements. MATERIALS AND METHODS: Thermoluminescent dosimeter capsules were attached to a CTC rectal catheter to obtain four measurements of the CT radiation dose in 10 volunteers (five men and five women; age range, 23-87 years; mean age, 70.4 years). A fixed CT technique (supine and prone, 50 mAs and 120 kVp each) was used for CTC. SSDEs and percentile body habitus measurements were based on CT images and directly compared with in vivo dose measurements. RESULTS: The mean absorbed doses delivered to the rectum ranged from 8.8 to 23.6 mGy in the 10 patients, whose mean body habitus was in the 27th percentile among American adults 18-64 years old (range, 0.5-67th percentile). The mean SSDE error was 7.2% (range, 0.6-31.4%). CONCLUSION: This in vivo radiation dose measurement technique can be applied to patients undergoing CTC. Our measurements indicate that SSDEs are reasonable estimates of the rectal absorbed dose. The data obtained in this pilot study can be used as benchmarks for assessing dose estimates using other indirect methods (e.g., Monte Carlo simulations).
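For readers unfamiliar with SSDEs: an SSDE scales the scanner-reported CTDIvol by a size-dependent conversion factor. The exponential fit below follows the AAPM Report 204 form for the 32-cm phantom, with coefficients quoted from memory; verify them against the report before any real use.

```python
# SSDE sketch: size-dependent rescaling of CTDIvol (assumed coefficients).
import math

def ssde_mgy(ctdi_vol_mgy: float, effective_diameter_cm: float) -> float:
    # effective diameter is typically sqrt(AP * lateral) patient dimensions
    f = 3.704369 * math.exp(-0.03671937 * effective_diameter_cm)
    return f * ctdi_vol_mgy

print(f"SSDE ~ {ssde_mgy(10.0, 28.0):.1f} mGy")  # hypothetical 10 mGy CTDIvol, 28 cm patient
```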


Subject(s)
Colonography, Computed Tomographic , Radiation Dosage , Rectum/radiation effects , Thermoluminescent Dosimetry/instrumentation , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Monte Carlo Method , Pilot Projects
20.
Biometrics ; 70(2): 366-77, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24575781

ABSTRACT

To evaluate the utility of automated deformable image registration (DIR) algorithms, it is necessary to evaluate both the registration accuracy of the DIR algorithm itself and the registration accuracy of the human readers from whom the "gold standard" is obtained. We propose a Bayesian hierarchical model to evaluate the spatial accuracy of human readers and automatic DIR methods based on multiple image registration datasets generated by both. To fully account for the locations of landmarks in all images, we treat the true locations of landmarks as latent variables and impose a hierarchical structure on the magnitude of registration errors observed across image pairs. DIR registration errors are modeled using Gaussian processes with reference prior densities on prior parameters that determine the associated covariance matrices. We develop a Gibbs sampling algorithm to efficiently fit our models to high-dimensional data, and apply the proposed method to analyze an image dataset obtained from a 4D thoracic CT study.
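A minimal sketch of the latent-variable structure described above, in our own notation (the article's full model adds Gaussian-process error models across image pairs and reference priors on the covariance parameters):

```latex
\[
  y_{\ell r} = \mu_{\ell} + \varepsilon_{\ell r},
  \qquad
  \varepsilon_{\ell r} \sim \mathcal{N}\!\left(0, \Sigma_{r}\right),
\]
```

where μ_ℓ is the latent true location of landmark ℓ and Σ_r is the error covariance of reader or algorithm r; a Gibbs sampler then alternates between updating the latent locations and the covariance parameters.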


Subject(s)
Algorithms , Image Interpretation, Computer-Assisted/methods , Models, Statistical , Bayes Theorem , Biometry/methods , Computer Simulation , Expert Testimony , Four-Dimensional Computed Tomography/statistics & numerical data , Humans , Normal Distribution