Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 1 de 1
Filter
Add more filters










Database
Language
Publication year range
1.
Stat Methods Med Res ; 30(7): 1708-1724, 2021 07.
Article in English | MEDLINE | ID: mdl-34074165

ABSTRACT

There is a well-established tradition within the statistics literature that explores different techniques for reducing the dimensionality of large feature spaces. The problem is central to machine learning and it has been largely explored under the unsupervised learning paradigm. We introduce a supervised clustering methodology that capitalizes on a Metropolis Hastings algorithm to optimize the partition structure of a large categorical feature space tailored towards minimizing the test error of a learning algorithm. This is a general methodology that can be applied to any supervised learning problem with a large categorical feature space. We show the benefits of the algorithm by applying this methodology to the problem of risk adjustment in competitive health insurance markets. We use a large claims data set that records ICD-10 codes, a large categorical feature space. We aim at improving risk adjustment by clustering diagnostic codes into risk groups suitable for health expenditure prediction. We test the performance of our methodology against common alternatives using panel data from a representative sample of twenty three million citizens in Colombian Healthcare System. Our results outperform common alternatives and suggest that it has potential to improve risk adjustment.


Subject(s)
Algorithms , Machine Learning , Cluster Analysis
SELECTION OF CITATIONS
SEARCH DETAIL
...