Results 1 - 8 of 8
1.
STAR Protoc ; 4(2): 102302, 2023 May 12.
Article in English | MEDLINE | ID: mdl-37178115

ABSTRACT

The AutoScore framework can automatically generate data-driven clinical scores in various clinical applications. Here, we present a protocol for developing clinical scoring systems for binary, survival, and ordinal outcomes using the open-source AutoScore package. We describe steps for package installation, detailed data processing and checking, and variable ranking. We then explain how to iterate through steps for variable selection, score generation, fine-tuning, and evaluation to generate understandable and explainable scoring systems using data-driven evidence and clinical knowledge. For complete details on the use and execution of this protocol, please refer to Xie et al. (2020), Xie et al. (2022), Saffari et al. (2022), and the online tutorial https://nliulab.github.io/AutoScore/.

2.
World Neurosurg ; 161: 284-290, 2022 05.
Article in English | MEDLINE | ID: mdl-35505546

ABSTRACT

BACKGROUND: Missing data is a typical problem in clinical studies, where the value of variables of interest is not measured or collected for some patients. This article aimed to review imputation approaches for missing values and their application in neurosurgery. METHODS: We reviewed current practices on detecting missingness patterns and applications of multiple imputation approaches under different scenarios. Statistical considerations and the importance of sensitivity analysis were explained. Various imputation methods were applied to a retrospective cohort. RESULTS: For illustration purposes, a retrospective cohort of 609 patients harboring both ruptured and unruptured intracranial aneurysms and undergoing microsurgical clip reconstruction at Erasmus MC University Medical Center, Rotterdam, The Netherlands, between 2000 and 2019 was used. The modified Rankin Scale score at 6 months was the clinical outcome, and potential predictors were age, sex, size of aneurysm, hypertension, smoking, World Federation of Neurosurgical Societies grade, and aneurysm location. Associations were investigated using different imputation approaches, and the results were compared and discussed. CONCLUSIONS: Missing values should be treated carefully. Advantages and disadvantages of multiple imputation methods along with imputation in small and big data should be considered depending on the research question and specifics of the study.
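
Illustrative note (not part of the record): the abstract does not say how estimates from the imputed datasets were combined. A minimal sketch, assuming the standard Rubin's rules for pooling, is given below; the function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool results from m imputed datasets with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical coefficient estimates from three imputed datasets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what single imputation ignores, which is why single-imputation standard errors tend to be too small.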


Subject(s)
Intracranial Aneurysm , Cohort Studies , Data Interpretation, Statistical , Humans , Intracranial Aneurysm/surgery , Netherlands , Retrospective Studies
3.
Patterns (N Y) ; 3(4): 100452, 2022 Apr 08.
Article in English | MEDLINE | ID: mdl-35465224

ABSTRACT

Interpretable machine learning has been focusing on explaining final models that optimize performance. The state-of-the-art Shapley additive explanations (SHAP) locally explains the variable impact on individual predictions and has recently been extended to provide global assessments across the dataset. Our work further extends "global" assessments to a set of models that are "good enough" and are practically as relevant as the final model to a prediction task. The resulting Shapley variable importance cloud consists of Shapley-based importance measures from each good model and pools information across models to provide an overall importance measure, with uncertainty explicitly quantified to support formal statistical inference. We developed visualizations to highlight the uncertainty and to illustrate its implications to practical inference. Building on a common theoretical basis, our method seamlessly complements the widely adopted SHAP assessments of a single final model to avoid biased inference, which we demonstrate in two experiments using recidivism prediction data and clinical data.

4.
J Biomed Inform ; 129: 104072, 2022 05.
Article in English | MEDLINE | ID: mdl-35421602

ABSTRACT

BACKGROUND: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among various decision-making models to determine the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. However, its current framework still leaves room for improvement when addressing unbalanced data of rare events. METHODS: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. Baseline techniques for performance comparison included the original AutoScore, full logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), full random forest, and random forest with a reduced number of variables. These models were evaluated based on their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches to predict inpatient mortality. RESULTS: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839), while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.801). The AutoScore-Imbalance sub-model (using a down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. 
Furthermore, AutoScore-Imbalance obtained the highest balanced accuracy of 0.757 (0.702-0.805), compared to 0.698 (0.643-0.753) by the original AutoScore and the maximum of 0.720 (0.664-0.769) by other baseline models. CONCLUSIONS: We have developed an interpretable tool to handle clinical data imbalance, presented its structure, and demonstrated its superiority over baselines. The AutoScore-Imbalance tool can be applied to highly unbalanced datasets to gain further insight into rare medical events and facilitate real-world clinical decision-making.
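
Illustrative note (not part of the record): the abstract defines balanced accuracy as the mean of sensitivity and specificity and mentions a down-sampling algorithm. The sketch below shows both ideas in plain Python. It assumes simple random down-sampling of the majority class, which may differ from what AutoScore-Imbalance actually does; the function names are hypothetical.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def down_sample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


score = balanced_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
rows, labels = down_sample(list(range(8)), [1, 0, 0, 0, 0, 1, 0, 0])
```

Unlike plain accuracy, balanced accuracy does not reward a model for predicting the majority class for every patient, which matters when the event (here, inpatient mortality) is rare.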


Subject(s)
Algorithms , Machine Learning , Clinical Decision-Making , Logistic Models , ROC Curve
5.
J Biomed Inform ; 125: 103959, 2022 01.
Article in English | MEDLINE | ID: mdl-34826628

ABSTRACT

BACKGROUND: Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad hoc using a few manually selected variables based on clinicians' knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. METHODS: AutoScore was previously developed as an interpretable machine learning score generator, combining machine learning with point-based scores for strong discriminability and accessibility. We have further extended it to time-to-event outcomes and developed AutoScore-Survival for generating time-to-event scores with right-censored survival data. Random survival forest provided an efficient solution for selecting variables, and Cox regression was used for score weighting. We implemented our proposed method as an R package. We illustrated our method in a study of 90-day survival prediction for patients in intensive care units and compared its performance with other survival models, the random survival forest, and two traditional clinical scores. RESULTS: The AutoScore-Survival-derived scoring system was more parsimonious than survival models built using traditional variable selection methods (e.g., penalized likelihood approach and stepwise variable selection), and its performance was comparable to survival models using the same set of variables. Although AutoScore-Survival achieved a comparable integrated area under the curve of 0.782 (95% CI: 0.767-0.794), the integer-valued time-to-event scores generated are favorable in clinical applications because they are easier to compute and interpret. CONCLUSIONS: Our proposed AutoScore-Survival provides a robust and easy-to-use machine learning-based clinical score generator for studies of time-to-event outcomes. It gives a systematic guideline to facilitate the future development of time-to-event scores for clinical applications.


Subject(s)
Machine Learning , Humans , Likelihood Functions
6.
JAMA Netw Open ; 4(8): e2118467, 2021 08 02.
Article in English | MEDLINE | ID: mdl-34448870

ABSTRACT

Importance: Triage in the emergency department (ED) is a complex clinical judgment based on the tacit understanding of the patient's likelihood of survival, availability of medical resources, and local practices. Although a scoring tool could be valuable in risk stratification, currently available scores have demonstrated limitations. Objectives: To develop an interpretable machine learning tool based on a parsimonious list of variables available at ED triage; provide a simple, early, and accurate estimate of patients' risk of death; and evaluate the tool's predictive accuracy compared with several established clinical scores. Design, Setting, and Participants: This single-site, retrospective cohort study assessed all ED patients between January 1, 2009, and December 31, 2016, who were subsequently admitted to a tertiary hospital in Singapore. The Score for Emergency Risk Prediction (SERP) tool was derived using a machine learning framework. To estimate mortality outcomes after emergency admissions, SERP was compared with several triage systems, including Patient Acuity Category Scale, Modified Early Warning Score, National Early Warning Score, Cardiac Arrest Risk Triage, Rapid Acute Physiology Score, and Rapid Emergency Medicine Score. The initial analyses were completed in October 2020, and additional analyses were conducted in May 2021. Main Outcomes and Measures: Three SERP scores, namely SERP-2d, SERP-7d, and SERP-30d, were developed using the primary outcomes of interest of 2-, 7-, and 30-day mortality, respectively. Secondary outcomes included 3-day mortality and inpatient mortality. The SERP's predictive power was measured using the area under the curve in the receiver operating characteristic analysis. 
Results: The study included 224 666 ED episodes in the model training cohort (mean [SD] patient age, 63.60 [16.90] years; 113 426 [50.5%] female), 56 167 episodes in the validation cohort (mean [SD] patient age, 63.58 [16.87] years; 28 427 [50.6%] female), and 42 676 episodes in the testing cohort (mean [SD] patient age, 64.85 [16.80] years; 21 556 [50.5%] female). The mortality rates in the training cohort were 0.8% at 2 days, 2.2% at 7 days, and 5.9% at 30 days. In the testing cohort, the areas under the curve of SERP-30d were 0.821 (95% CI, 0.796-0.847) for 2-day mortality, 0.826 (95% CI, 0.811-0.841) for 7-day mortality, and 0.823 (95% CI, 0.814-0.832) for 30-day mortality and outperformed several benchmark scores. Conclusions and Relevance: In this retrospective cohort study, SERP had better prediction performance than existing triage scores while maintaining easy implementation and ease of ascertainment in the ED. It has the potential to be widely applied and validated in different circumstances and health care settings.
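
Illustrative note (not part of the record): predictive power here is reported as the area under the ROC curve. As a minimal sketch, the AUC can be computed as the probability that a randomly chosen patient who died receives a higher score than a randomly chosen survivor, with ties counted as one half. The function name `roc_auc` and the toy data are hypothetical, and the study's own software is not specified in the abstract.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), fine for small illustrations.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: four patients, two of whom died (label 1).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values around 0.82 indicate that the score ranks a patient who died above a survivor in roughly four of five such pairs.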


Subject(s)
Emergency Service, Hospital/statistics & numerical data , Machine Learning , Patient Acuity , Patient Admission/statistics & numerical data , Risk Assessment/methods , Aged , Benchmarking , Female , Humans , Male , Middle Aged , Predictive Value of Tests , Reproducibility of Results , Retrospective Studies , Risk Factors , Singapore , Tertiary Care Centers , Triage
7.
JMIR Med Inform ; 8(10): e21798, 2020 Oct 21.
Article in English | MEDLINE | ID: mdl-33084589

ABSTRACT

BACKGROUND: Risk scores can be useful in clinical risk stratification and accurate allocations of medical resources, helping health providers improve patient care. Point-based scores are more understandable and explainable than other complex models and are now widely used in clinical decision making. However, the development of the risk scoring model is nontrivial and has not yet been systematically presented, with few studies investigating methods of clinical score generation using electronic health records. OBJECTIVE: This study aims to propose AutoScore, a machine learning-based automatic clinical score generator consisting of 6 modules for developing interpretable point-based scores. Future users can employ the AutoScore framework to create clinical scores effortlessly in various clinical applications. METHODS: We proposed the AutoScore framework comprising 6 modules that included variable ranking, variable transformation, score derivation, model selection, score fine-tuning, and model evaluation. To demonstrate the performance of AutoScore, we used data from the Beth Israel Deaconess Medical Center to build a scoring model for mortality prediction and then compared the model with other baseline models using the receiver operating characteristic analysis. A software package in R 3.5.3 (R Foundation) was also developed to demonstrate the implementation of AutoScore. RESULTS: Implemented on the data set with 44,918 individual admission episodes of intensive care, the AutoScore-created scoring models performed comparably to other standard methods (ie, logistic regression, stepwise regression, least absolute shrinkage and selection operator, and random forest) in terms of predictive accuracy and model calibration but required fewer predictors and presented high interpretability and accessibility. The nine-variable, AutoScore-created, point-based scoring model achieved an area under the curve (AUC) of 0.780 (95% CI 0.764-0.798), whereas the model of logistic regression with 24 variables had an AUC of 0.778 (95% CI 0.760-0.795). Moreover, the AutoScore framework also drives the clinical research continuum and automation with its integration of all necessary modules. CONCLUSIONS: We developed an easy-to-use, machine learning-based automatic clinical score generator, AutoScore; systematically presented its structure; and demonstrated its superiority (predictive performance and interpretability) over other conventional methods using a benchmark database. AutoScore will emerge as a potential scoring tool in various medical applications.
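
Illustrative note (not part of the record): the abstract names a score derivation module but does not describe its arithmetic. The sketch below shows one generic way to turn regression coefficients into integer points, by dividing each coefficient by the smallest nonzero one and rounding. This is an assumed, common technique and not necessarily AutoScore's method; the variable names and coefficient values are hypothetical.

```python
def coefficients_to_points(coefs):
    """Scale coefficients by the smallest nonzero magnitude and round to integers."""
    nonzero = [abs(b) for b in coefs.values() if b != 0]
    if not nonzero:
        raise ValueError("at least one coefficient must be nonzero")
    unit = min(nonzero)  # the weakest predictor is worth one point
    return {name: round(b / unit) for name, b in coefs.items()}


def total_score(points, patient):
    """Sum the points of every binary risk factor the patient has."""
    return sum(pts for name, pts in points.items() if patient.get(name, False))


# Hypothetical logistic regression coefficients for three binary risk factors.
example_coefs = {"age_ge_65": 0.9, "sbp_lt_90": 1.6, "on_ventilator": 0.45}
example_points = coefficients_to_points(example_coefs)

# A hypothetical patient aged 65 or older with low systolic blood pressure.
example_total = total_score(example_points, {"age_ge_65": True, "sbp_lt_90": True})
```

Rounding to integers trades a small amount of precision for a score that can be added up by hand at the bedside, which is the interpretability argument the abstract makes for point-based scores.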

8.
J Acad Nutr Diet ; 115(4): 537-50.e2, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25578925

ABSTRACT

BACKGROUND: Latino immigrants have high rates of obesity and face barriers to weight loss. OBJECTIVE: To evaluate the effectiveness of a case-management (CM) intervention with and without community health workers (CHWs) for weight loss. DESIGN: This was a 2-year, randomized controlled trial comparing two interventions with each other and with usual care (UC). PARTICIPANTS/SETTING: Eligible participants included Latinos with a body mass index of 30 to 60 and one or more heart disease risk factors. The 207 participants recruited during 2009-2010 had a mean age of 47 years and were mostly women (77%). At 24 months, 86% of the sample was assessed. INTERVENTION: The CM+CHW (n=82) and CM (n=84) interventions were compared with each other and with UC (n=41). Both included an intensive 12-month phase followed by 12 months of maintenance. The CM+CHW group received home visits. MAIN OUTCOME MEASURES: Weight change at 24 months. STATISTICAL ANALYSES: Generalized estimating equations using intent-to-treat. RESULTS: At 6 months, mean weight loss in the CM+CHW arm was -2.1 kg (95% CI -2.8 to -1.3) or -2% of baseline weight (95% CI -1% to -2%) compared with -1.6 kg (95% CI -2.4 to -0.7; % weight change, -2%, -1%, and -3%) in CM and -0.9 kg (95% CI -1.8 to 0.1; % weight change, -1%, 0%, and -2%) in UC. By 12 and 24 months, differences narrowed and CM+CHW was no longer statistically distinct. Men achieved greater weight loss than women in all groups at each time point (P<0.05). At 6 months, men in the CM+CHW arm lost more weight (-4.4 kg; 95% CI -6.0 to -2.7) compared with UC (-0.4 kg; 95% CI -2.4 to 1.5), but by 12 and 24 months differences were not significant. CONCLUSIONS: This study demonstrated that incorporation of CHWs may help promote initial weight loss, especially among men, but not weight maintenance. Additional strategies to address social and environmental influences may be needed for Latino immigrant populations.


Subject(s)
Community Health Services , Hispanic or Latino , Obesity/therapy , Poverty , Weight Loss , Adult , Blood Glucose , Blood Pressure , Body Mass Index , Community Health Workers , Diabetes Mellitus, Type 2 , Female , Humans , Male , Middle Aged , Sex Factors , Treatment Outcome , United States , Waist Circumference