ABSTRACT
OBJECTIVE: This study explores the prediction of near-term suicidal behavior using machine learning (ML) analyses of the Suicide Crisis Inventory (SCI), which measures the Suicide Crisis Syndrome, a presuicidal mental state. METHODS: SCI data were collected from high-risk psychiatric inpatients (N = 591), who were grouped by short-term suicidal behavior: those who attempted suicide between intake and the 1-month follow-up (N = 20) and those who did not (N = 571). Data were analyzed using three predictive algorithms (logistic regression, random forest, and gradient boosting) and three sampling approaches (split sample, synthetic minority oversampling technique [SMOTE], and enhanced bootstrap). RESULTS: The enhanced bootstrap approach considerably outperformed the other sampling approaches, with the random forest (98.0% precision; 33.9% recall; 71.0% area under the precision-recall curve [AUPRC]; 87.8% area under the receiver operating characteristic curve [AUROC]) and gradient boosting (94.0% precision; 48.9% recall; 70.5% AUPRC; 89.4% AUROC) algorithms performing best in predicting positive cases of near-term suicidal behavior in this dataset. CONCLUSIONS: ML can be useful for analyzing data from psychometric scales, such as the SCI, and for predicting near-term suicidal behavior. However, when the data are highly imbalanced, as in the current analysis, the method of measuring performance must be carefully considered and selected.
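The point about imbalanced data can be illustrated with simple arithmetic on the group sizes reported above. The following is a minimal sketch, not the study's analysis code: the helper `confusion_metrics` and the "always negative" classifier are hypothetical and introduced only for illustration. It shows that a model predicting no attempts at all would still reach high accuracy, which is one reason precision, recall, and AUPRC may be more informative here. As a general rule of thumb, the AUPRC expected from a random classifier is roughly the positive-class prevalence.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    Hypothetical helper for illustration; precision or recall is NaN
    when its denominator is zero.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    accuracy = (tp + tn) / total
    return precision, recall, accuracy


# Group sizes reported in the abstract.
N_POS = 20   # attempted suicide between intake and 1-month follow-up
N_NEG = 571  # did not attempt

# Positive-class prevalence; approximately the AUPRC of a random classifier.
prevalence = N_POS / (N_POS + N_NEG)

# Hypothetical trivial classifier that labels every patient as negative:
# no true or false positives, every positive case is missed.
_, trivial_recall, trivial_accuracy = confusion_metrics(
    tp=0, fp=0, fn=N_POS, tn=N_NEG
)

print(f"prevalence:       {prevalence:.1%}")
print(f"trivial accuracy: {trivial_accuracy:.1%}")
print(f"trivial recall:   {trivial_recall:.1%}")
```

Under these assumptions the trivial classifier is about 96.6% accurate while identifying none of the 20 positive cases, and the prevalence of about 3.4% gives a rough reference point for reading the reported AUPRC values.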