Results 1 - 14 of 14
1.
Artif Intell Med ; 154: 102899, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38843692

ABSTRACT

Predictive modeling is becoming an essential tool for clinical decision support, but health systems with smaller sample sizes may construct suboptimal or overly specific models. Models become over-specific when, besides true physiological effects, they also incorporate potentially volatile site-specific artifacts. These artifacts can change suddenly and can render the model unsafe. To obtain safer models, health systems with inadequate sample sizes may adopt one of the following options. First, they can use a generic model, such as one purchased from a vendor, but often such a model is not sufficiently specific to the patient population and is thus suboptimal. Second, they can participate in a research network. Paradoxically though, sites with smaller datasets contribute correspondingly less to the joint model, again rendering the final model suboptimal. Lastly, they can use transfer learning, starting from a model trained on a large data set and updating this model to the local population. This strategy can also result in a model that is over-specific. In this paper we present the consensus modeling paradigm, which uses the help of a large site (source) to reach a consensus model at the small site (target). We evaluate the approach on predicting postoperative complications at two health systems with 9,044 and 38,045 patients (rare outcomes at about 1% positive rate), and conduct a simulation study to understand the performance of consensus modeling relative to the other three approaches as a function of the available training sample size at the target site. We found that consensus modeling exhibited the least over-specificity at either the source or target site and achieved the highest combined predictive performance.

2.
Entropy (Basel) ; 26(3)2024 Mar 02.
Article in English | MEDLINE | ID: mdl-38539740

ABSTRACT

The knowledge of the causal mechanisms underlying one single system may not be sufficient to answer certain questions. One can gain additional insights from comparing and contrasting the causal mechanisms underlying multiple systems and uncovering consistent and distinct causal relationships. For example, discovering common molecular mechanisms among different diseases can lead to drug repurposing. The problem of comparing causal mechanisms among multiple systems is non-trivial, since the causal mechanisms are usually unknown and need to be estimated from data. If we estimate the causal mechanisms from data generated from different systems and directly compare them (the naive method), the result can be sub-optimal. This is especially true if the data generated by the different systems differ substantially with respect to their sample sizes. In this case, the quality of the estimated causal mechanisms for the different systems will differ, which can in turn affect the accuracy of the estimated similarities and differences among the systems via the naive method. To mitigate this problem, we introduced the bootstrap estimation and the equal sample size resampling estimation method for estimating the difference between causal networks. Both of these methods use resampling to assess the confidence of the estimation. We compared these methods with the naive method in a set of systematically simulated experimental conditions with a variety of network structures and sample sizes, and using different performance metrics. We also evaluated these methods on various real-world biomedical datasets covering a wide range of data designs.

4.
EBioMedicine ; 85: 104292, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36182774

ABSTRACT

BACKGROUND: The hard endpoint of death is one of the most significant outcomes in both clinical practice and research settings. Our goal was to discover direct causes of longevity from medically accessible data. METHODS: Using a framework that combines local causal discovery algorithms with discovery of maximally predictive and compact feature sets (the "Markov boundaries" of the response) and equivalence classes, we examined 186 variables and their relationships with survival over 27 years in 1507 participants, aged ≥71 years, of the longitudinal, community-based D-EPESE study. FINDINGS: As few as 8-15 variables predicted longevity at 2-, 5- and 10-years with predictive performance (area under receiver operator characteristic curve) of 0·76 (95% CIs 0·69, 0·83), 0·76 (0·72, 0·81) and 0·66 (0·61, 0·71), respectively. Numbers of small high-density lipoprotein particles, younger age, and fewer pack years of cigarette smoking were the strongest determinants of longevity at 2-, 5- and 10-years, respectively. Physical function was a prominent predictor of longevity at all time horizons. Age and cognitive function contributed to predictions at 5 and 10 years. Age was not among the local 2-year prediction variables (although significant in univariable analysis), thus establishing that age is not a direct cause of 2-year longevity in the context of measured factors in our data that determine longevity. INTERPRETATION: The discoveries in this study proceed from causal data science analyses of deep clinical and molecular phenotyping data in a community-based cohort of older adults with known lifespan. FUNDING: NIH/NIA R01AG054840, R01AG12765, and P30-AG028716, NIH/NIA Contract N01-AG-12102 and NCRR 1UL1TR002494-01.


Subject(s)
Exercise , Longevity , Humans , Aged , Cohort Studies
5.
IEEE J Biomed Health Inform ; 26(11): 5728-5737, 2022 11.
Article in English | MEDLINE | ID: mdl-36006882

ABSTRACT

A cornerstone of clinical medicine is intervening on a continuous exposure, such as titrating the dosage of a pharmaceutical or controlling a laboratory result. In clinical trials, continuous exposures are dichotomized into narrow ranges, excluding large portions of the realistic treatment scenarios. The existing computational methods for estimating the effect of continuous exposure rely on a set of strict assumptions. We introduce new methods that are more robust towards violations of these assumptions. Our methods are based on the key observation that changes of exposure in the clinical setting are often achieved gradually, so effect estimates must be "locally" robust in narrower exposure ranges. We compared our methods with several existing methods on three simulated studies with increasing complexity. We also applied the methods to data from 14 k sepsis patients at M Health Fairview to estimate the effect of antibiotic administration latency on prolonged hospital stay. The proposed methods achieve good performance in all simulation studies. When the assumptions were violated, the proposed methods had estimation errors of one half to one fifth of the state-of-the-art methods. Applying our methods to the sepsis cohort resulted in effect estimates consistent with clinical knowledge.


Subject(s)
Sepsis , Humans , Computer Simulation , Cohort Studies , Sepsis/diagnosis
6.
Crit Care Med ; 50(5): 799-809, 2022 05 01.
Article in English | MEDLINE | ID: mdl-34974496

ABSTRACT

OBJECTIVES: Sepsis remains a leading and preventable cause of hospital utilization and mortality in the United States. Despite updated guidelines, the optimal definition of sepsis as well as optimal timing of bundled treatment remain uncertain. Identifying patients with infection who benefit from early treatment is a necessary step for tailored interventions. In this study, we aimed to illustrate clinical predictors of time-to-antibiotics among patients with severe bacterial infection and model the effect of delay on risk-adjusted outcomes across different sepsis definitions. DESIGN: A multicenter retrospective observational study. SETTING: A seven-hospital network including an academic tertiary care center. PATIENTS: Eighteen thousand three hundred fifteen patients admitted with severe bacterial illness with or without sepsis by either acute organ dysfunction (AOD) or systemic inflammatory response syndrome positivity. MEASUREMENTS AND MAIN RESULTS: The primary exposure was time to antibiotics. We identified patient predictors of time-to-antibiotics including demographics, chronic diagnoses, vitals, and laboratory results and determined the impact of delay on a composite of in-hospital death or length of stay over 10 days. Distribution of time-to-antibiotics was similar across patients with and without sepsis. For all patients, a J-curve relationship between time-to-antibiotics and outcomes was observed, primarily driven by length of stay among patients without AOD. Patient characteristics provided good to excellent prediction of time-to-antibiotics irrespective of the presence of sepsis. Reduced time-to-antibiotics was associated with improved outcomes for all time points beyond 2.5 hours from presentation across sepsis definitions. CONCLUSIONS: Antibiotic timing is a function of patient factors regardless of sepsis criteria. Similarly, we show that early administration of antibiotics is associated with improved outcomes in all patients with severe bacterial illness. Our findings suggest identifying infection is a rate-limiting and actionable step that can improve outcomes in septic and nonseptic patients.


Subject(s)
Bacterial Infections , Sepsis , Shock, Septic , Anti-Bacterial Agents/therapeutic use , Bacterial Infections/drug therapy , Hospital Mortality , Hospitalization , Humans , Retrospective Studies , United States
7.
Prehosp Emerg Care ; 26(4): 556-565, 2022.
Article in English | MEDLINE | ID: mdl-34313534

ABSTRACT

Objective: A tiered trauma team activation system allocates resources proportional to patients' needs based upon injury burden. Previous trauma hospital-triage models are limited to predicting Injury Severity Score which is based on > 10% all-cause in-hospital mortality, rather than need for emergent intervention within 6 hours (NEI-6). Our aim was to develop a novel prediction model for hospital-triage that utilizes criteria available to the EMS provider to predict NEI-6 and the need for a trauma team activation. Methods: A regional trauma quality collaborative was used to identify all trauma patients ≥ 16 years from the American College of Surgeons-Committee on Trauma verified Level 1 and 2 trauma centers. Logistic regression and random forest were used to construct two predictive models for NEI-6 based on clinically relevant variables. Restricted cubic splines were used to model nonlinear predictors. The accuracy of the prediction model was assessed in terms of discrimination. Results: Using data from 12,624 patients for the training dataset (62.6% male; median age 61 years; median ISS 9) and 9,445 patients for the validation dataset (62.6% male; median age 59 years; median ISS 9), the following significant predictors were selected for the prediction models: age, gender, field GCS, vital signs, intentionality, and mechanism of injury. The final boosted tree model showed an AUC of 0.85 in the validation cohort for predicting NEI-6. Conclusions: The NEI-6 trauma triage prediction model used prehospital metrics to predict need for highest level of trauma activation. Prehospital prediction of major trauma may reduce undertriage mortality and improve resource utilization.


Subject(s)
Emergency Medical Services , Wounds and Injuries , Female , Hospitals , Humans , Injury Severity Score , Male , Middle Aged , Retrospective Studies , Trauma Centers , Triage , Wounds and Injuries/therapy
8.
J Am Med Inform Assoc ; 29(1): 72-79, 2021 12 28.
Article in English | MEDLINE | ID: mdl-34963141

ABSTRACT

OBJECTIVE: Hospital-acquired infections (HAIs) are associated with significant morbidity, mortality, and prolonged hospital length of stay. Risk prediction models based on pre- and intraoperative data have been proposed to assess the risk of HAIs at the end of the surgery, but the performance of these models lags behind that of HAI detection models based on postoperative data. Postoperative data are more predictive than pre- or intraoperative data since they are closer to the outcomes in time, but they are unavailable when the risk models are applied (end of surgery). The objective is to study whether such data, which are temporally unavailable at prediction time (TUP) (and thus cannot directly enter the model), can be used to improve the performance of the risk model. MATERIALS AND METHODS: An extensive array of 12 methods based on logistic/linear regression and deep learning were used to incorporate the TUP data using a variety of intermediate representations of the data. Due to the hierarchical structure of different HAI outcomes, a comparison of single and multi-task learning frameworks is also presented. RESULTS AND DISCUSSION: The use of TUP data was always advantageous as baseline methods, which cannot utilize TUP data, never achieved the top performance. The relative performances of the different models vary across the different outcomes. Regarding the intermediate representation, we found that its complexity was key and that incorporating label information was helpful. CONCLUSIONS: Using TUP data significantly helped predictive performance irrespective of the model complexity.


Subject(s)
Cross Infection , Cross Infection/epidemiology , Hospitals , Humans , Logistic Models , Morbidity
9.
JAMIA Open ; 4(3): ooab055, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34350391

ABSTRACT

OBJECTIVE: Ensuring an efficient response to COVID-19 requires a degree of inter-system coordination and capacity management coupled with an accurate assessment of hospital utilization including length of stay (LOS). We aimed to establish optimal practices in inter-system data sharing and LOS modeling to support patient care and regional hospital operations. MATERIALS AND METHODS: We completed a retrospective observational study of patients admitted with COVID-19 followed by 12-week prospective validation, involving 36 hospitals covering the upper Midwest. We developed a method for sharing de-identified patient data across systems for analysis. From this, we compared 3 approaches to identify features associated with LOS: a generalized linear model (GLM), a random forest (RF), and aggregated system-level averages. We compared model performance by area under the ROC curve (AUROC). RESULTS: A total of 2068 patients were included and used for model derivation and 597 patients for validation. LOS overall had a median of 5.0 days and mean of 8.2 days. Consistent predictors of LOS included age, critical illness, oxygen requirement, weight loss, and nursing home admission. In the validation cohort, the RF model (AUROC 0.890) and GLM model (AUROC 0.864) achieved good to excellent prediction of LOS, but only marginally better than system averages in practice. CONCLUSION: Regional sharing of patient data allowed for effective prediction of LOS across systems; however, this only provided marginal improvement over hospital averages at the aggregate level. A federated approach of sharing aggregated system capacity and average LOS will likely allow for effective capacity management at the regional level.
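For readers unfamiliar with the AUROC metric used to compare the models above, the following is a minimal sketch of how it can be computed from binary labels and model scores. It uses the rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case). The function name and the toy data are hypothetical, and the abstract does not state how LOS was turned into a binary outcome, so the labels here are simply assumed to be given.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive case
    has the higher score; ties count one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): 3 positives, 2 negatives, one tied pair.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.6, 0.3]
print(auroc(labels, scores))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; sorting-based implementations are preferable for large cohorts.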

10.
J Am Coll Surg ; 232(6): 963-971.e1, 2021 06.
Article in English | MEDLINE | ID: mdl-33831539

ABSTRACT

BACKGROUND: Surgical complications have tremendous consequences and costs. Complication detection is important for quality improvement, but traditional manual chart review is burdensome. Automated mechanisms are needed to make this more efficient. To understand the generalizability of a machine learning algorithm between sites, automated surgical site infection (SSI) detection algorithms developed at one center were tested at another distinct center. STUDY DESIGN: NSQIP patients had electronic health record (EHR) data extracted at one center (University of Minnesota Medical Center, Site A) over a 4-year period for model development and internal validation, and at a second center (University of California San Francisco, Site B) over a subsequent 2-year period for external validation. Models for automated NSQIP SSI detection of superficial, organ space, and total SSI within 30 days postoperatively were validated using area under the curve (AUC) scores and corresponding 95% confidence intervals. RESULTS: For the 8,883 patients (Site A) and 1,473 patients (Site B), AUC scores were not statistically different for any outcome including superficial (external 0.804, internal [0.784, 0.874] AUC); organ/space (external 0.905, internal [0.867, 0.941] AUC); and total (external 0.855, internal [0.854, 0.908] AUC) SSI. False negative rates decreased with increasing case review volume and would be amenable to a strategy in which cases with low predicted probabilities of SSI could be excluded from chart review. CONCLUSIONS: Our findings demonstrated that SSI detection machine learning algorithms developed at 1 site were generalizable to another institution. SSI detection models are practically applicable to accelerate and focus chart review.


Subject(s)
Electronic Health Records/statistics & numerical data , Machine Learning , Medical Audit/methods , Quality Improvement , Surgical Wound Infection/diagnosis , Adult , Aged , Datasets as Topic , Female , Hospitals/statistics & numerical data , Humans , Male , Medical Audit/statistics & numerical data , Middle Aged , Risk Factors , Surgical Wound Infection/epidemiology
11.
J Gen Intern Med ; 35(5): 1413-1418, 2020 05.
Article in English | MEDLINE | ID: mdl-32157649

ABSTRACT

BACKGROUND: Predicting death in a cohort of clinically diverse, multi-condition hospitalized patients is difficult. This frequently hinders timely serious illness care conversations. Prognostic models that can determine 6-month death risk at the time of hospital admission can improve access to serious illness care conversations. OBJECTIVE: The objective is to determine if the demographic, vital sign, and laboratory data from the first 48 h of a hospitalization can be used to accurately quantify 6-month mortality risk. DESIGN: This is a retrospective study using electronic medical record data linked with the state death registry. PARTICIPANTS: Participants were 158,323 hospitalized patients within a 6-hospital network over a 6-year period. MAIN MEASURES: Main measures are the following: the first set of vital signs, complete blood count, basic and complete metabolic panel, serum lactate, pro-BNP, troponin-I, INR, aPTT, demographic information, and associated ICD codes. The outcome of interest was death within 6 months. KEY RESULTS: Model performance was measured on the validation dataset. A random forest model, the mini serious illness algorithm (min-SIA), used 8 variables from the initial 48 h of hospitalization and predicted death within 6 months with an AUC of 0.92 (0.91-0.93). Red cell distribution width was the most important prognostic variable. min-SIA was very well calibrated and estimated the probability of death to within 10% of the actual value. The discriminative ability of the min-SIA was significantly better than historical estimates of clinician performance. CONCLUSION: The min-SIA algorithm can identify patients at high risk of 6-month mortality at the time of hospital admission. It can be used to improve access to timely, serious illness care conversations in high-risk patients.


Subject(s)
Algorithms , Hospitalization , Cohort Studies , Hospital Mortality , Hospitals , Humans , Retrospective Studies , Risk Assessment
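The calibration claim in the abstract above (predicted probability of death close to the observed value) can be checked with a binned comparison of predicted and observed risk. The sketch below is a generic illustration, not the evaluation procedure of the study; the function name, the equal-width binning, and the default number of bins are assumptions.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Group cases into equal-width probability bins and return, for each
    non-empty bin, (mean predicted probability, observed event rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    table = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            obs_rate = sum(y for y, _ in members) / len(members)
            table.append((mean_pred, obs_rate, len(members)))
    return table
```

A well-calibrated model yields rows in which the mean predicted probability and the observed event rate are close in every bin.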
12.
Clin Cancer Res ; 26(1): 213-219, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31527166

ABSTRACT

PURPOSE: Predicting surgical outcome could improve individualizing treatment strategies for patients with advanced ovarian cancer. It has been suggested earlier that gene expression signatures (GES) might harbor the potential to predict surgical outcome. EXPERIMENTAL DESIGN: Data derived from high-grade serous tumor tissue of FIGO stage IIIC/IV patients of the AGO-OVAR11 trial were used to generate transcriptome profiles. Previously identified molecular signatures were tested. A theoretical model was implemented to evaluate the impact of medically associated factors for residual disease (RD) on the performance of GES that predicts RD status. RESULTS: A total of 266 patients met inclusion criteria; of those, 39.1% underwent complete resection. Previously reported GES did not predict RD in this cohort. Similarly, The Cancer Genome Atlas molecular subtypes, an independent de novo signature and the total gene expression dataset using all 21,000 genes were not able to predict RD status. Medical reasons for RD were identified as potential limiting factors that impact the ability to use GES to predict RD. In a center with high complete resection rates, a GES which would perfectly predict tumor biological RD would have a performance of only AUC 0.83, due to reasons other than tumor biology. CONCLUSIONS: Previously identified GES cannot be generalized. Medically associated factors for RD may be the main obstacle to predict surgical outcome in an all-comer population of patients with advanced ovarian cancer. If biomarkers derived from tumor tissue are used to predict outcome of patients with cancer, selection bias should be focused on to prevent overestimation of the power of such a biomarker. See related commentary by Handley and Sood, p. 9.


Subject(s)
Carcinoma, Ovarian Epithelial , Ovarian Neoplasms , Biomarkers , Cytoreduction Surgical Procedures , Female , Humans , Neoplasm Staging
13.
Stud Health Technol Inform ; 264: 398-402, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31437953

ABSTRACT

Surgical procedures carry the risk of postoperative infectious complications, which can be severe, expensive, and morbid. A growing body of evidence indicates that high-resolution intraoperative data can be predictive of these complications. However, these studies are often contradictory in their findings as well as difficult to replicate, suggesting that these predictive models may be capturing institutional artifacts. In this work, data and models from two independent institutions, Mayo Clinic and University of Minnesota-affiliated Fairview Health Services, were directly compared using a common set of definitions for the variables and outcomes. We built perioperative risk models for seven infectious post-surgical complications at each site to assess the value of intraoperative variables. Models were internally validated. We found that including intraoperative variables significantly improved the models' predictive performance at both sites for five out of seven complications. We also found that significant intraoperative variables were similar between the two sites for four of the seven complications. Our results suggest that intraoperative variables can be related to the underlying physiology for some infectious complications.


Subject(s)
Communicable Diseases , Humans , Postoperative Complications , Retrospective Studies
14.
AMIA Annu Symp Proc ; 2018: 1093-1102, 2018.
Article in English | MEDLINE | ID: mdl-30815151

ABSTRACT

We report recent progress in the development of a precision test for individualized use of the VEGF-A targeting drug bevacizumab for treating ovarian cancer. We discuss the discovery model stage (i.e., past feasibility modeling and before conversion to the production test). Main results: (a) Informatics modeling plays a critical role in supporting driving clinical and health economic requirements. (b) The novel computational models support the creation of a precision test with sufficient predictivity to reduce healthcare system costs up to $30 billion over 10 years, and make the use of bevacizumab affordable without loss of length or quality of life.


Subject(s)
Antineoplastic Agents, Immunological/therapeutic use , Bevacizumab/therapeutic use , Carcinoma, Ovarian Epithelial/drug therapy , Computational Biology , Molecular Targeted Therapy , Ovarian Neoplasms/drug therapy , Precision Medicine/methods , Therapy, Computer-Assisted , Computer Simulation , Cost Savings , Data Science , Delivery of Health Care/economics , Female , Humans , Kaplan-Meier Estimate , Models, Biological , Molecular Targeted Therapy/economics , Quality of Life