ABSTRACT
A commonly used method for evaluating a hospital's performance on an outcome is to compare the hospital's observed outcome rate to its expected outcome rate given its patient (case) mix and service. The process of calculating this expected outcome rate is called risk adjustment (Iezzoni 1997). Risk adjustment is critical for accurately evaluating and comparing hospitals' performance, since we would not want to unfairly penalize a hospital just because it treats sicker patients. The key to risk adjustment is accurately estimating the probability of an outcome given patient characteristics. For binary outcomes, the method commonly used in risk adjustment is logistic regression. In this paper, we consider ensemble of trees methods as alternatives for risk adjustment, including random forests and Bayesian additive regression trees (BART). Both random forests and BART are modern machine learning methods that have recently been shown to have excellent performance for prediction of outcomes in many settings. We apply these methods to carry out risk adjustment for the performance of neonatal intensive care units (NICUs). We show that these ensemble of trees methods outperform logistic regression in predicting mortality among babies treated in NICUs, and provide a superior method of risk adjustment compared to logistic regression.
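The observed-versus-expected comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `observed_expected` and the record layout are assumptions, and the predicted probabilities are taken as given (they could come from logistic regression, a random forest, BART, or any other model).

```python
from collections import defaultdict


def observed_expected(records):
    """Compare each hospital's observed outcome rate with its expected rate.

    records: iterable of (hospital_id, outcome, predicted_prob), where
    outcome is 0 or 1 and predicted_prob is the model-based probability of
    the outcome given that patient's characteristics (hypothetical layout).
    """
    # Per hospital: [number of outcomes, sum of predicted probabilities, n]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for hospital, outcome, prob in records:
        entry = totals[hospital]
        entry[0] += outcome
        entry[1] += prob
        entry[2] += 1

    summary = {}
    for hospital, (events, expected_events, n) in totals.items():
        summary[hospital] = {
            "observed": events / n,
            "expected": expected_events / n,
            # A ratio above 1 means more outcomes than the case mix predicts.
            "oe_ratio": events / expected_events if expected_events > 0 else float("nan"),
        }
    return summary


# Illustrative data only: hospital A treats sicker patients than hospital B.
example = [("A", 1, 0.5), ("A", 0, 0.5), ("B", 0, 0.1), ("B", 0, 0.1)]
result = observed_expected(example)
```

In this sketch, hospital A's raw mortality is higher than B's, but its observed-to-expected ratio shows it performing as predicted for its case mix, which is the point of risk adjustment.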
Subject(s)
Artificial Intelligence , Bayes Theorem , Hospital Mortality , Outcome Assessment, Health Care/methods , Risk Adjustment/methods , Birth Weight , Diagnosis-Related Groups/statistics & numerical data , Female , Gestational Age , Hospital Administration , Humans , Intensive Care Units, Neonatal , Logistic Models , Pregnancy , Pregnancy Complications/epidemiology , Premature Birth/epidemiology , Prenatal Care/statistics & numerical data , Socioeconomic Factors
ABSTRACT
The population attributable fraction (PAF) for an exposure is the fraction of disease cases in a population that can be attributed to that exposure. One method of estimating the PAF involves estimating the probability of having the disease given the exposure and confounding variables. In many settings, the exposure will interact with the confounders and the confounders will interact with each other. Also, in many settings, the probability of having the disease is thought, based on subject matter knowledge, to be a monotone increasing function of the exposure and possibly of some of the confounders. We develop an efficient approach for estimating logistic regression models with interactions and monotonicity constraints, and apply this approach to estimating the PAF. In some settings, our approach produces substantially more accurate estimates of the PAF than the usual approach, which uses logistic regression without monotonicity constraints.
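One common model-based form of the PAF estimate compares the expected number of cases under the observed exposures with the expected number if everyone were unexposed. The sketch below assumes that standardization form; it does not reproduce the constrained fitting procedure developed in the paper. The function names, the coefficients in `example_model`, and the sample data are hypothetical.

```python
import math


def paf(prob, subjects, unexposed=0):
    """Model-based PAF: 1 - (expected cases if unexposed) / (expected cases).

    prob: function (exposure, confounder) -> P(disease | exposure, confounder),
          for example a fitted logistic regression (assumed already fitted).
    subjects: iterable of (exposure, confounder) pairs from the population.
    """
    subjects = list(subjects)
    factual = sum(prob(e, c) for e, c in subjects)
    counterfactual = sum(prob(unexposed, c) for _, c in subjects)
    return 1.0 - counterfactual / factual


def example_model(exposure, confounder):
    """Hypothetical logistic model with an exposure-confounder interaction.

    With these made-up coefficients the disease probability is monotone
    increasing in the exposure whenever the confounder is non-negative.
    """
    linear = -2.0 + 1.0 * exposure + 0.5 * confounder + 0.3 * exposure * confounder
    return 1.0 / (1.0 + math.exp(-linear))


# Illustrative population of (exposure, confounder) pairs.
population = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 2)]
estimate = paf(example_model, population)
```

Because the estimate depends entirely on the fitted probabilities, any gain in accuracy from imposing monotonicity constraints on the model carries over directly to the PAF.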
Subject(s)
Confounding Factors, Epidemiologic , Data Interpretation, Statistical , Logistic Models , Regression Analysis , Aged , Computer Simulation , Depression/psychology , Humans , Suicide/psychology
ABSTRACT
For group-randomized trials, randomization inference based on rank statistics provides exact inference that is robust to nonnormal distributions. However, in a matched-pair design, the currently available rank-based statistics lose significant power compared to normal linear mixed model (LMM) test statistics when the LMM is true. In this article, we investigate and develop an optimal test statistic over all statistics in the form of a weighted sum of signed Mann-Whitney-Wilcoxon statistics under certain assumptions. This test is almost as powerful as the LMM test even when the LMM is true, and it is much more powerful for heavy-tailed distributions. A simulation study is conducted to examine the power.