
A Comparison of Models Predicting One-Year Mortality at Time of Admission.

Pierce, Robert P; Raithel, Seth; Brandt, Lea; Clary, Kevin W; Craig, Kevin.

J Pain Symptom Manage ; 63(3): e287-e293, 2022 Mar.

Article in English | MEDLINE | ID: mdl-34826545

ABSTRACT

CONTEXT: Hospitalization provides an opportunity to address end-of-life care (EoLC) preferences if patients at risk of death can be accurately identified while in the hospital. The modified Hospital One-Year Mortality Risk (mHOMR) uses demographic and admission data in a logistic regression algorithm to identify patients at risk of death one year from admission. OBJECTIVES: This project sought to validate mHOMR and identify superior models. METHODS: The mHOMR model was validated using historical data from an academic health system. Alternative logistic regression and random forest (RF) models were developed using the same variables. Receiver operating characteristic (ROC) and precision recall curves were developed, and sensitivity, specificity, and positive and negative predictive values were compared over a range of model thresholds. RESULTS: The RF model demonstrated higher area under the ROC curve (0.950, 95% CI 0.947 - 0.954) as compared to the logistic regression models (0.818 [95% CI 0.812 - 0.825] and 0.841 [95% CI 0.836 - 0.847]). Area under the precision recall curve was higher with the random forest model compared to the logistic regression models (0.863 vs. 0.458 and 0.494, respectively). Across a range of thresholds, the RF model demonstrated superior sensitivity, equivalent specificity, and higher positive and negative predictive values. CONCLUSION: A machine learning RF model, using common demographic and utilization data available on hospital admission, identified inpatients at risk of death more effectively than logistic regression models using the same variables. Machine learning models have promise for identifying admitted patients with elevated one-year mortality risk, increasing opportunities to prompt discussion of EoLC preferences.
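As an illustration of the threshold-based comparison described in the abstract, the following minimal sketch computes sensitivity, specificity, and positive and negative predictive values for one set of predicted risks at a chosen cutoff. The function names, the toy labels and scores, and the 0.5 threshold are assumptions made for this example; they are not taken from the study, and this is not the authors' code.

```python
import math


def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when a score >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_metrics(y_true, scores, threshold):
    """Return sensitivity, specificity, PPV, and NPV at a single threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data: 1 = died within one year, 0 = survived.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1, 0.8, 0.2]
toy_metrics = threshold_metrics(toy_labels, toy_scores, threshold=0.5)
```

Calling `threshold_metrics` over a range of thresholds for each model's scores would give the kind of side-by-side comparison the abstract reports.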

Subject(s)

Hospitalization , Machine Learning , Hospital Mortality , Humans , Logistic Models , ROC Curve , Retrospective Studies

Inferential considerations for low-count RNA-seq transcripts: a case study on the dominant prairie grass Andropogon gerardii.

Raithel, Seth; Johnson, Loretta; Galliart, Matthew; Brown, Sue; Shelton, Jennifer; Herndon, Nicolae; Bello, Nora M.

BMC Genomics ; 17: 140, 2016 Feb 27.

Article in English | MEDLINE | ID: mdl-26919855

ABSTRACT

BACKGROUND: Differential expression (DE) analysis of RNA-seq data still poses inferential challenges, such as handling of transcripts characterized by low expression levels. In this study, we use a plasmode-based approach to assess the relative performance of alternative inferential strategies on RNA-seq transcripts, with special emphasis on transcripts characterized by a small number of read counts, so-called low-count transcripts, as motivated by an ecological application in prairie grasses. Big bluestem (Andropogon gerardii) is a wide-ranging dominant prairie grass of ecological and agricultural importance to the US Midwest while edaphic subspecies sand bluestem (A. gerardii ssp. hallii) grows exclusively on sand dunes. Relative to big bluestem, sand bluestem exhibits qualitative phenotypic divergence consistent with enhanced drought tolerance, plausibly associated with transcripts of low expression levels. Our dataset consists of RNA-seq read counts for 25,582 transcripts (60% of which are classified as low-count) collected from leaf tissue of individual plants of big bluestem (n = 4) and sand bluestem (n = 4). Focused on low-count transcripts, we compare alternative ad-hoc data filtering techniques commonly used in RNA-seq pipelines and assess the inferential performance of recently developed statistical methods for DE analysis, namely DESeq2 and edgeR robust. These methods attempt to overcome the inherently noisy behavior of low-count transcripts by either shrinkage or differential weighting of observations, respectively. RESULTS: Both DE methods seemed to properly control family-wise type I error on low-count transcripts, whereas edgeR robust showed greater power and DESeq2 showed greater precision and accuracy. However, specification of the degrees of freedom parameter under edgeR robust had a non-trivial impact on inference and should be handled carefully. When properly specified, both DE methods showed overall promising inferential performance on low-count transcripts, suggesting that ad-hoc data filtering steps at arbitrary expression thresholds may be unnecessary. A note of caution is in order regarding the approximate nature of DE tests under both methods. CONCLUSIONS: Practical recommendations for DE inference are provided when low-count RNA-seq transcripts are of interest, as is the case in the comparison of subspecies of bluestem grasses. Insights from this study may also be relevant to other applications focused on transcripts of low expression levels.
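For readers unfamiliar with the ad-hoc filtering step the abstract refers to, the sketch below shows one common form: dropping transcripts whose mean read count across samples falls below a fixed cutoff before DE analysis. The function name, the cutoff of 5, and the toy counts are assumptions for illustration only. The abstract does not state which filters or thresholds were compared, and this is not the authors' pipeline.

```python
def filter_low_count(counts, min_mean_count):
    """Split transcripts into kept and dropped by mean read count.

    counts: dict mapping a transcript ID to its per-sample read counts.
    min_mean_count: hypothetical cutoff; transcripts below it are dropped.
    """
    kept, dropped = {}, {}
    for transcript_id, reads in counts.items():
        mean_count = sum(reads) / len(reads)
        if mean_count >= min_mean_count:
            kept[transcript_id] = reads
        else:
            dropped[transcript_id] = reads
    return kept, dropped


# Hypothetical counts for 8 plants (4 big bluestem, 4 sand bluestem).
toy_counts = {
    "transcript_a": [0, 1, 0, 2, 1, 0, 0, 0],          # mean 0.5
    "transcript_b": [50, 62, 48, 55, 70, 66, 59, 61],  # clearly expressed
    "transcript_c": [3, 4, 6, 5, 7, 6, 4, 5],          # mean 5.0
}
kept, dropped = filter_low_count(toy_counts, min_mean_count=5)
```

The cutoff is arbitrary by construction, which is the concern the abstract raises: a transcript just under the threshold is discarded before any test is run, regardless of whether it differs between subspecies.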

Subject(s)

Andropogon/genetics , Genomics/methods , RNA, Plant/genetics , Sequence Analysis, RNA/methods , Transcriptome , Adaptation, Physiological/genetics , Phenotype
