ABSTRACT
BACKGROUND: For community-dwelling elderly individuals for whom sufficient clinical data are not available, it is important to develop a method that predicts dementia risk and identifies risk factors, so that reasonable public health policies for dementia prevention can be formulated. OBJECTIVE: Data from a community survey of elderly residents were used to establish machine learning prediction models for dementia and to analyze the risk factors. METHODS: In a cluster-sample community survey of 9,387 elderly people in 5 subdistricts of Wuxi City, data were collected on sociodemographics and on neuropsychological self-rating scales for depression, anxiety, and cognition. Machine learning models were developed to predict dementia risk and identify risk factors. RESULTS: The random forest model (AUC = 0.686) had slightly better dementia prediction performance than the logistic regression model (AUC = 0.677) and the neural network model (AUC = 0.664). The sociodemographic data and psychological evaluation revealed the following risk factors for suspected dementia: depression (OR = 3.933, 95% CI = 2.995-5.166); anxiety (OR = 2.352, 95% CI = 1.577-3.509); multiple physical diseases (OR = 2.486, 95% CI = 1.882-3.284 for three or more); the special family statuses "disability, poverty or no family member" (OR = 1.859, 95% CI = 1.337-2.585) and "empty nester" (OR = 1.339, 95% CI = 1.125-1.595); "no spouse now" (OR = 1.567, 95% CI = 1.118-2.197); age older than 80 years (OR = 1.645, 95% CI = 1.335-2.026); and female sex (OR = 1.214, 95% CI = 1.048-1.405). A higher education level (OR = 0.365, 95% CI = 0.245-0.546 for college or above) was a protective factor. CONCLUSION: Machine learning models that use sociodemographic and psychological evaluation data from community surveys can serve as references for the prevention and control of dementia in large-scale community populations and for the formulation of public health policies.
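The odds ratios and confidence intervals above can be related to logistic-regression coefficients with a short sketch. It assumes the intervals are Wald-type intervals, exp(beta ± z·SE), which the abstract does not state; the function names `odds_ratio_ci` and `implied_beta_se` are hypothetical and introduced only for illustration. Only the depression estimate reported above is used as input.

```python
import math

# Two-sided 95% quantile of the standard normal distribution.
Z_95 = 1.959963984540054


def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error.

    Assumes a Wald-type interval: exp(beta - z*se) to exp(beta + z*se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_beta_se(odds_ratio, lower, upper, z=Z_95):
    """Back out the log-odds coefficient and standard error from a reported OR and CI.

    Valid only if the reported interval is symmetric on the log scale (Wald-type).
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Depression estimate as reported in the abstract: OR = 3.933, 95% CI = 2.995-5.166.
beta, se = implied_beta_se(3.933, 2.995, 5.166)
or_hat, ci_low, ci_high = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f} se={se:.3f} OR={or_hat:.3f} CI={ci_low:.3f}-{ci_high:.3f}")
```

Under this assumption, the reconstructed interval agrees with the reported one up to rounding, which suggests the reported interval is consistent with a Wald-type calculation. An odds ratio below 1, such as the one reported for higher education, corresponds to a negative coefficient.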