Search | BVS Regional Portal (test)

A universal AutoScore framework to develop interpretable scoring systems for predicting common types of clinical outcomes.

Xie, Feng; Ning, Yilin; Liu, Mingxuan; Li, Siqi; Saffari, Seyed Ehsan; Yuan, Han; Volovici, Victor; Ting, Daniel Shu Wei; Goldstein, Benjamin Alan; Ong, Marcus Eng Hock; Vaughan, Roger; Chakraborty, Bibhas; Liu, Nan.

STAR Protoc ; 4(2): 102302, 2023 May 12.

Article in English | MEDLINE | ID: mdl-37178115

ABSTRACT

The AutoScore framework can automatically generate data-driven clinical scores in various clinical applications. Here, we present a protocol for developing clinical scoring systems for binary, survival, and ordinal outcomes using the open-source AutoScore package. We describe steps for package installation, detailed data processing and checking, and variable ranking. We then explain how to iterate through steps for variable selection, score generation, fine-tuning, and evaluation to generate understandable and explainable scoring systems using data-driven evidence and clinical knowledge. For complete details on the use and execution of this protocol, please refer to Xie et al. (2020), Xie et al. (2022), Saffari et al. (2022), and the online tutorial at https://nliulab.github.io/AutoScore/.

Proper Use of Multiple Imputation and Dealing with Missing Covariate Data.

Saffari, Seyed Ehsan; Volovici, Victor; Ong, Marcus Eng Hock; Goldstein, Benjamin Alan; Vaughan, Roger; Dammers, Ruben; Steyerberg, Ewout W; Liu, Nan.

World Neurosurg ; 161: 284-290, 2022 05.

Article in English | MEDLINE | ID: mdl-35505546

ABSTRACT

BACKGROUND: Missing data is a typical problem in clinical studies, where the value of variables of interest is not measured or collected for some patients. This article aimed to review imputation approaches for missing values and their application in neurosurgery. METHODS: We reviewed current practices on detecting missingness patterns and applications of multiple imputation approaches under different scenarios. Statistical considerations and importance of sensitivity analysis were explained. Various imputation methods were applied to a retrospective cohort. RESULTS: For illustration purposes, a retrospective cohort of 609 patients harboring both ruptured and unruptured intracranial aneurysms and undergoing microsurgical clip reconstruction at Erasmus MC University Medical Center, Rotterdam, The Netherlands, between 2000 and 2019 was used. The modified Rankin Scale score at 6 months was the clinical outcome, and potential predictors were age, sex, size of aneurysm, hypertension, smoking, World Federation of Neurosurgical Societies grade, and aneurysm location. Associations were investigated using different imputation approaches, and the results were compared and discussed. CONCLUSIONS: Missing values should be treated carefully. Advantages and disadvantages of multiple imputation methods along with imputation in small and big data should be considered depending on the research question and specifics of the study.
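Multiple imputation produces one estimate per imputed dataset, and these estimates have to be pooled before inference. The abstract does not name a pooling rule, so the Python sketch below assumes the standard Rubin's rules; the function name `pool_rubin` and the input numbers are hypothetical and are not taken from the article.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules (assumed here).

    estimates: one coefficient estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)          # average of the m estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log-odds estimates from three imputed datasets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what single imputation ignores: when the imputed datasets disagree, the total variance grows, and confidence intervals widen accordingly.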

Subjects

Intracranial Aneurysm; Cohort Studies; Data Interpretation, Statistical; Humans; Intracranial Aneurysm/surgery; Netherlands; Retrospective Studies

Shapley variable importance cloud for interpretable machine learning.

Ning, Yilin; Ong, Marcus Eng Hock; Chakraborty, Bibhas; Goldstein, Benjamin Alan; Ting, Daniel Shu Wei; Vaughan, Roger; Liu, Nan.

Patterns (N Y) ; 3(4): 100452, 2022 Apr 08.

Article in English | MEDLINE | ID: mdl-35465224

ABSTRACT

Interpretable machine learning has been focusing on explaining final models that optimize performance. The state-of-the-art Shapley additive explanations (SHAP) locally explains the variable impact on individual predictions and has recently been extended to provide global assessments across the dataset. Our work further extends "global" assessments to a set of models that are "good enough" and are practically as relevant as the final model to a prediction task. The resulting Shapley variable importance cloud consists of Shapley-based importance measures from each good model and pools information across models to provide an overall importance measure, with uncertainty explicitly quantified to support formal statistical inference. We developed visualizations to highlight the uncertainty and to illustrate its implications to practical inference. Building on a common theoretical basis, our method seamlessly complements the widely adopted SHAP assessments of a single final model to avoid biased inference, which we demonstrate in two experiments using recidivism prediction data and clinical data.

AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data.

Yuan, Han; Xie, Feng; Ong, Marcus Eng Hock; Ning, Yilin; Chee, Marcel Lucas; Saffari, Seyed Ehsan; Abdullah, Hairil Rizal; Goldstein, Benjamin Alan; Chakraborty, Bibhas; Liu, Nan.

J Biomed Inform ; 129: 104072, 2022 05.

Article in English | MEDLINE | ID: mdl-35421602

ABSTRACT

BACKGROUND: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among various decision-making models to determine the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. However, its current framework still leaves room for improvement when addressing unbalanced data of rare events. METHODS: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. Baseline techniques for performance comparison included the original AutoScore, full logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), full random forest, and random forest with a reduced number of variables. These models were evaluated based on their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches to predict inpatient mortality. RESULTS: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839), while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.801). The AutoScore-Imbalance sub-model (using a down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. 
Furthermore, AutoScore-Imbalance obtained the highest balanced accuracy of 0.757 (0.702-0.805), compared to 0.698 (0.643-0.753) by the original AutoScore and the maximum of 0.720 (0.664-0.769) by other baseline models. CONCLUSIONS: We have developed an interpretable tool to handle clinical data imbalance, presented its structure, and demonstrated its superiority over baselines. The AutoScore-Imbalance tool can be applied to highly unbalanced datasets to gain further insight into rare medical events and facilitate real-world clinical decision-making.
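The abstract defines balanced accuracy as the mean of sensitivity and specificity. A minimal Python sketch of that definition follows; the function name and the toy labels are hypothetical and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return (sensitivity + specificity) / 2


# Toy rare-event example: 2 events among 10 cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))  # 0.6875
```

In this toy example plain accuracy is 0.8 even though half of the events are missed, which is why a class-balanced metric is the more informative choice for rare outcomes.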

Subjects

Algorithms; Machine Learning; Clinical Decision-Making; Logistic Models; ROC Curve

AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data.

Xie, Feng; Ning, Yilin; Yuan, Han; Goldstein, Benjamin Alan; Ong, Marcus Eng Hock; Liu, Nan; Chakraborty, Bibhas.

J Biomed Inform ; 125: 103959, 2022 01.

Article in English | MEDLINE | ID: mdl-34826628

ABSTRACT

BACKGROUND: Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad hoc using a few manually selected variables based on clinicians' knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. METHODS: AutoScore was previously developed as an interpretable machine learning score generator that combines the strong discriminability of machine learning with the accessibility of point-based scores. We have further extended it to time-to-event outcomes and developed AutoScore-Survival for generating time-to-event scores with right-censored survival data. Random survival forest provided an efficient solution for selecting variables, and Cox regression was used for score weighting. We implemented our proposed method as an R package. We illustrated our method in a study of 90-day survival prediction for patients in intensive care units and compared its performance with other survival models, the random survival forest, and two traditional clinical scores. RESULTS: The AutoScore-Survival-derived scoring system was more parsimonious than survival models built using traditional variable selection methods (e.g., penalized likelihood approach and stepwise variable selection), and its performance was comparable to survival models using the same set of variables. Although AutoScore-Survival achieved a comparable integrated area under the curve of 0.782 (95% CI: 0.767-0.794), the integer-valued time-to-event scores generated are favorable in clinical applications because they are easier to compute and interpret. CONCLUSIONS: Our proposed AutoScore-Survival provides a robust and easy-to-use machine learning-based clinical score generator for studies of time-to-event outcomes. It gives a systematic guideline to facilitate the future development of time-to-event scores for clinical applications.

Subjects

Machine Learning; Humans; Likelihood Functions

Development and Assessment of an Interpretable Machine Learning Triage Tool for Estimating Mortality After Emergency Admissions.

Xie, Feng; Ong, Marcus Eng Hock; Liew, Johannes Nathaniel Min Hui; Tan, Kenneth Boon Kiat; Ho, Andrew Fu Wah; Nadarajan, Gayathri Devi; Low, Lian Leng; Kwan, Yu Heng; Goldstein, Benjamin Alan; Matchar, David Bruce; Chakraborty, Bibhas; Liu, Nan.

JAMA Netw Open ; 4(8): e2118467, 2021 08 02.

Article in English | MEDLINE | ID: mdl-34448870

ABSTRACT

Importance: Triage in the emergency department (ED) is a complex clinical judgment based on the tacit understanding of the patient's likelihood of survival, availability of medical resources, and local practices. Although a scoring tool could be valuable in risk stratification, currently available scores have demonstrated limitations. Objectives: To develop an interpretable machine learning tool based on a parsimonious list of variables available at ED triage; provide a simple, early, and accurate estimate of patients' risk of death; and evaluate the tool's predictive accuracy compared with several established clinical scores. Design, Setting, and Participants: This single-site, retrospective cohort study assessed all ED patients between January 1, 2009, and December 31, 2016, who were subsequently admitted to a tertiary hospital in Singapore. The Score for Emergency Risk Prediction (SERP) tool was derived using a machine learning framework. To estimate mortality outcomes after emergency admissions, SERP was compared with several triage systems, including Patient Acuity Category Scale, Modified Early Warning Score, National Early Warning Score, Cardiac Arrest Risk Triage, Rapid Acute Physiology Score, and Rapid Emergency Medicine Score. The initial analyses were completed in October 2020, and additional analyses were conducted in May 2021. Main Outcomes and Measures: Three SERP scores, namely SERP-2d, SERP-7d, and SERP-30d, were developed using the primary outcomes of interest of 2-, 7-, and 30-day mortality, respectively. Secondary outcomes included 3-day mortality and inpatient mortality. The SERP's predictive power was measured using the area under the curve in the receiver operating characteristic analysis. 
Results: The study included 224 666 ED episodes in the model training cohort (mean [SD] patient age, 63.60 [16.90] years; 113 426 [50.5%] female), 56 167 episodes in the validation cohort (mean [SD] patient age, 63.58 [16.87] years; 28 427 [50.6%] female), and 42 676 episodes in the testing cohort (mean [SD] patient age, 64.85 [16.80] years; 21 556 [50.5%] female). The mortality rates in the training cohort were 0.8% at 2 days, 2.2% at 7 days, and 5.9% at 30 days. In the testing cohort, the areas under the curve of SERP-30d were 0.821 (95% CI, 0.796-0.847) for 2-day mortality, 0.826 (95% CI, 0.811-0.841) for 7-day mortality, and 0.823 (95% CI, 0.814-0.832) for 30-day mortality and outperformed several benchmark scores. Conclusions and Relevance: In this retrospective cohort study, SERP had better prediction performance than existing triage scores while maintaining easy implementation and ease of ascertainment in the ED. It has the potential to be widely applied and validated in different circumstances and health care settings.

Subjects

Emergency Service, Hospital/statistics & numerical data; Machine Learning; Patient Acuity; Patient Admission/statistics & numerical data; Risk Assessment/methods; Aged; Benchmarking; Female; Humans; Male; Middle Aged; Predictive Value of Tests; Reproducibility of Results; Retrospective Studies; Risk Factors; Singapore; Tertiary Care Centers; Triage

AutoScore: A Machine Learning-Based Automatic Clinical Score Generator and Its Application to Mortality Prediction Using Electronic Health Records.

Xie, Feng; Chakraborty, Bibhas; Ong, Marcus Eng Hock; Goldstein, Benjamin Alan; Liu, Nan.

JMIR Med Inform ; 8(10): e21798, 2020 Oct 21.

Article in English | MEDLINE | ID: mdl-33084589

ABSTRACT

BACKGROUND: Risk scores can be useful in clinical risk stratification and accurate allocations of medical resources, helping health providers improve patient care. Point-based scores are more understandable and explainable than other complex models and are now widely used in clinical decision making. However, the development of the risk scoring model is nontrivial and has not yet been systematically presented, with few studies investigating methods of clinical score generation using electronic health records. OBJECTIVE: This study aims to propose AutoScore, a machine learning-based automatic clinical score generator consisting of 6 modules for developing interpretable point-based scores. Future users can employ the AutoScore framework to create clinical scores effortlessly in various clinical applications. METHODS: We proposed the AutoScore framework comprising 6 modules that included variable ranking, variable transformation, score derivation, model selection, score fine-tuning, and model evaluation. To demonstrate the performance of AutoScore, we used data from the Beth Israel Deaconess Medical Center to build a scoring model for mortality prediction and then compared the data with other baseline models using the receiver operating characteristic analysis. A software package in R 3.5.3 (R Foundation) was also developed to demonstrate the implementation of AutoScore. RESULTS: Implemented on the data set with 44,918 individual admission episodes of intensive care, the AutoScore-created scoring models performed comparably well as other standard methods (ie, logistic regression, stepwise regression, least absolute shrinkage and selection operator, and random forest) in terms of predictive accuracy and model calibration but required fewer predictors and presented high interpretability and accessibility. 
The nine-variable, AutoScore-created, point-based scoring model achieved an area under the curve (AUC) of 0.780 (95% CI 0.764-0.798), whereas the model of logistic regression with 24 variables had an AUC of 0.778 (95% CI 0.760-0.795). Moreover, the AutoScore framework also drives the clinical research continuum and automation with its integration of all necessary modules. CONCLUSIONS: We developed an easy-to-use, machine learning-based automatic clinical score generator, AutoScore; systematically presented its structure; and demonstrated its superiority (predictive performance and interpretability) over other conventional methods using a benchmark database. AutoScore will emerge as a potential scoring tool in various medical applications.
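The AUC values reported above summarize how well a score ranks patients who died above those who survived. As a hedged illustration, and not the AutoScore package's own code, the Python sketch below computes the AUC as the fraction of (event, non-event) pairs in which the event case has the higher score, with ties counted as one half. The function name and the toy scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random event case outscores a random
    non-event case (ties count one half). Labels are coded 0/1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in labels")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy integer risk scores for four patients; 1 marks the outcome event.
print(roc_auc([0, 0, 1, 1], [1, 2, 2, 3]))  # 0.875
```

Integer point scores produce many ties, so the half-credit rule matters more here than it does for continuous model outputs. The pairwise loop is quadratic in the sample size and is meant for illustration only.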

The effectiveness of two community-based weight loss strategies among obese, low-income US Latinos.

Rosas, Lisa Goldman; Thiyagarajan, Sreedevi; Goldstein, Benjamin Alan; Drieling, Rebecca Lucia; Romero, Priscilla Padilla; Ma, Jun; Yank, Veronica; Stafford, Randall Scott.

J Acad Nutr Diet ; 115(4): 537-50.e2, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25578925

ABSTRACT

BACKGROUND: Latino immigrants have high rates of obesity and face barriers to weight loss. OBJECTIVE: To evaluate the effectiveness of a case-management (CM) intervention with and without community health workers (CHWs) for weight loss. DESIGN: This was a 2-year, randomized controlled trial comparing two interventions with each other and with usual care (UC). PARTICIPANTS/SETTING: Eligible participants included Latinos with a body mass index of 30 to 60 and one or more heart disease risk factors. The 207 participants recruited during 2009-2010 had a mean age of 47 years and were mostly women (77%). At 24 months, 86% of the sample was assessed. INTERVENTION: The CM+CHW (n=82) and CM (n=84) interventions were compared with each other and with UC (n=41). Both included an intensive 12-month phase followed by 12 months of maintenance. The CM+CHW group received home visits. MAIN OUTCOME MEASURES: Weight change at 24 months. STATISTICAL ANALYSES: Generalized estimating equations using intent-to-treat. RESULTS: At 6 months, mean weight loss in the CM+CHW arm was -2.1 kg (95% CI -2.8 to -1.3) or -2% of baseline weight (95% CI -1% to -2%) compared with -1.6 kg (95% CI -2.4 to -0.7; % weight change, -2%, -1%, and -3%) in CM and -0.9 kg (95% CI -1.8 to 0.1; % weight change, -1%, 0%, and -2%) in UC. By 12 and 24 months, differences narrowed and CM+CHW was no longer statistically distinct. Men achieved greater weight loss than women in all groups at each time point (P<0.05). At 6 months, men in the CM+CHW arm lost more weight (-4.4 kg; 95% CI -6.0 to -2.7) compared with UC (-0.4 kg; 95% CI -2.4 to 1.5), but by 12 and 24 months differences were not significant. CONCLUSIONS: This study demonstrated that incorporation of CHWs may help promote initial weight loss, especially among men, but not weight maintenance. Additional strategies to address social and environmental influences may be needed for Latino immigrant populations.

Subjects

Community Health Services; Hispanic or Latino; Obesity/therapy; Poverty; Weight Loss; Adult; Blood Glucose; Blood Pressure; Body Mass Index; Community Health Workers; Diabetes Mellitus, Type 2; Female; Humans; Male; Middle Aged; Sex Factors; Treatment Outcome; United States; Waist Circumference
