ABSTRACT
INTRODUCTION AND OBJECTIVES: Given the rising prevalence of pre-sarcopenia in metabolic dysfunction-associated steatotic liver disease (MASLD), this study aimed to develop and validate a machine learning-based model to identify pre-sarcopenia in the MASLD population. MATERIALS AND METHODS: A total of 571 MASLD subjects were screened from the National Health and Nutrition Examination Survey (NHANES) 2017-2018. This cohort was randomly divided into a training set and an internal testing set at a ratio of 7:3. Sixty-six MASLD subjects from our institution served as an external validation set. Four binary classifiers, namely random forest (RF), support vector machine, extreme gradient boosting, and logistic regression, were fitted to identify pre-sarcopenia. The best-performing model was further validated in the external validation set. Model performance was assessed in terms of discrimination and calibration. SHapley Additive exPlanations (SHAP) were used for model interpretability. RESULTS: The pre-sarcopenia rate was 17.51% in the NHANES cohort and 15.16% in the external validation set. RF outperformed the other models, with an area under the receiver operating characteristic curve (AUROC) of 0.819 (95% CI: 0.749, 0.889). When the six top-ranking features by variable importance were retained (weight-adjusted waist, sex, race, creatinine, education, and alkaline phosphatase), the final RF model reached an AUROC of 0.824 (95% CI: 0.737, 0.910) in the internal testing set and 0.732 (95% CI: 0.529, 0.936) in the external validation set. Sensitivity analysis supported the robustness of the model. The calibration curve and decision curve analysis indicated good calibration and clinical utility. CONCLUSIONS: This study proposes a user-friendly model based on an explainable machine learning algorithm to predict pre-sarcopenia in the MASLD population. A web-based tool is provided for screening pre-sarcopenia in community and hospital settings.
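ILLUSTRATIVE SKETCH: The 7:3 random split and the AUROC used to assess discrimination can be illustrated with a minimal, standard-library Python example. It is not the study's code. The function names `split_70_30` and `auroc`, the seed, and the rounding rule for the split point are assumptions made for illustration, and the toy labels and scores are invented rather than taken from the study. The AUROC is computed here by its rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle a copy of `items` and split it 7:3 into (train, test).

    The seed and the rounding of the cut point are illustrative
    assumptions; the abstract states only the 7:3 ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """Rank-based AUROC: P(score of positive > score of negative).

    `labels` holds 0/1 outcomes and `scores` the predicted risks.
    Tied positive/negative pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Split 571 hypothetical subject indices into training and testing sets.
train, test = split_70_30(range(571))

# Toy example: two negatives and two positives with made-up risk scores.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(len(train), len(test), toy_auc)
```

In the toy example, three of the four positive-negative pairs are ranked correctly, so the AUROC is 0.75. The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of this size; a sort-based implementation would be preferable for large datasets.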