[Using principal component analysis to increase accuracy of prediction of metabolic syndrome in artificial neural network and logistic regression models]
Journal of Shahrekord University of Medical Sciences. 2011; 13 (4): 18-27
in Persian (Fa)
| IMEMR
| ID: emr-194655
Responsible library:
EMRO
Background and aims: In modeling, correlation among covariates causes multicollinearity, which may reduce the efficiency of the model. This study aimed to use principal component analysis to eliminate the effect of multicollinearity in logistic regression and neural network models, and to determine its effect on the accuracy of predicting metabolic syndrome in a sample of participants in the Tehran Lipid and Glucose Study.
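To illustrate the idea behind this aim, the following minimal Python sketch applies principal component analysis to two correlated covariates and checks that the resulting component scores are uncorrelated. The data are synthetic, and the names x1, x2 and principal_components_2d are illustrative assumptions, not the study's variables or code. The sketch also works on centred rather than standardized covariates, which is a simplification.

```python
import math
import random


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation coefficient."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def principal_components_2d(xs, ys):
    """Rotate two centred covariates onto their principal axes.

    For a 2x2 covariance matrix the rotation angle has a closed form,
    so no linear algebra library is needed.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = covariance(xs, xs)
    syy = covariance(ys, ys)
    sxy = covariance(xs, ys)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * (x - mean_x) + s * (y - mean_y) for x, y in zip(xs, ys)]
    pc2 = [-s * (x - mean_x) + c * (y - mean_y) for x, y in zip(xs, ys)]
    return pc1, pc2


# Synthetic, strongly correlated covariates (hypothetical values).
random.seed(0)
x1 = [random.gauss(95.0, 10.0) for _ in range(200)]
x2 = [0.3 * v + random.gauss(0.0, 1.5) for v in x1]

pc1, pc2 = principal_components_2d(x1, x2)

print("correlation of original covariates:", round(correlation(x1, x2), 3))
print("correlation of component scores:   ", round(correlation(pc1, pc2), 3))
```

Because the component scores are uncorrelated by construction, a model fitted on them is not affected by the collinearity present in the original covariates.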
Methods: A total of 347 participants from the cohort section of the Tehran Lipid and Glucose Study [TLGS] were evaluated. At baseline, the subjects were free of metabolic syndrome according to the ATP III criteria. Four models were fitted to the data: logistic regression, logistic regression with principal components, neural network, and neural network with principal components. The ability of the models to predict metabolic syndrome was compared using ROC analysis and the kappa statistic.
Results: The area under the receiver operating characteristic [ROC] curve for logistic regression, logistic regression with principal components, neural network, and neural network with principal components was estimated as 0.749, 0.790, 0.890 and 0.927, respectively. The sensitivity of the models was 0.483, 0.435, 0.836 and 0.919, and their specificity was 0.857, 0.919, 0.892 and 0.964, respectively. The kappa statistic for these models was 0.322, 0.386, 0.712 and 0.886, respectively.
Conclusion: The study shows that the prediction accuracy of models based on principal components is better than that of models based on the original covariates. In the presence of multicollinearity, models based on principal components are therefore efficient for predicting metabolic syndrome.
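The comparison measures reported above (area under the ROC curve, sensitivity, specificity and kappa) can be computed as in the following minimal Python sketch. The confusion-matrix counts and predicted scores are made-up illustrative values, not data from the study, and the function names are assumptions chosen for this example.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from 2x2 confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa: agreement between prediction and truth beyond chance."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (observed - expected) / (1 - expected)


def auc(scores_pos, scores_neg):
    """Area under the ROC curve as the probability that a random positive
    case receives a higher score than a random negative case (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts: 50 true cases and 50 non-cases.
tp, fn, fp, tn = 40, 10, 5, 45
sens, spec = sensitivity_specificity(tp, fn, fp, tn)
kappa = cohen_kappa(tp, fn, fp, tn)

# Hypothetical predicted probabilities for cases and non-cases.
area = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

print(f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.3f} AUC={area:.3f}")
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why it can rank models differently from sensitivity or specificity taken alone.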
Index:
IMEMR
Study type:
Prognostic_studies
Language:
Persian (Fa)
Journal:
J. Shahrekord Univ. Med. Sci.
Year of publication:
2011