Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval / 대한의료정보학회지
Article in English | WPRIM | ID: wpr-175292
Responsible library: WPRO
ABSTRACT
OBJECTIVES: This study investigated the effects of query expansion algorithms on MEDLINE retrieval within a pseudo-relevance feedback framework.
METHODS: A number of query expansion algorithms based on pseudo-relevance feedback were tested using various term ranking formulas. The OHSUMED test collection, a subset of the MEDLINE database, served as the test corpus. Each term ranking algorithm was tested in combination with different term re-weighting algorithms.
RESULTS: Our comprehensive evaluation showed that the local context analysis ranking algorithm, when combined with one of the re-weighting algorithms (Rocchio, the probabilistic model, or our variants), significantly outperformed other algorithm combinations by up to 12% (paired t-test; p < 0.05). In a pseudo-relevance feedback framework, effective query expansion therefore depends on the careful choice of term ranking and re-weighting algorithm pairs, at least in the context of the OHSUMED corpus.
CONCLUSIONS: Comparative experiments on term ranking algorithms were performed on a subset of MEDLINE documents. With medical documents, local context analysis, which uses co-occurrence with all query terms, significantly outperformed various term ranking methods based on both frequency and distribution analyses. Furthermore, the experiments demonstrated that the term rank-based re-weighting method contributed to a remarkable improvement in mean average precision.
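The general shape of pseudo-relevance feedback with Rocchio re-weighting can be sketched as below. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `rocchio_expand`, its parameters, and the default values of `alpha` and `beta` are hypothetical, and candidate terms are ranked by mean frequency in the top-ranked documents as a simple stand-in for the term ranking formulas (such as local context analysis) compared in the abstract.

```python
from collections import Counter


def rocchio_expand(query_terms, ranked_docs, k=3, n_terms=2, alpha=1.0, beta=0.75):
    """Expand a query using the top-k documents of an initial ranking.

    ranked_docs: list of token lists, already sorted by an initial retrieval run.
    Returns a dict mapping each term of the expanded query to its weight.
    All names and default values here are illustrative assumptions.
    """
    # Pseudo-relevant set: the top-k documents are assumed relevant, not judged.
    feedback = ranked_docs[:k]

    # Original query vector: raw term frequency in the query.
    q_vec = Counter(query_terms)

    # Centroid of the feedback documents (mean term frequency per document).
    totals = Counter()
    for doc in feedback:
        totals.update(doc)
    n = max(len(feedback), 1)
    centroid = {term: count / n for term, count in totals.items()}

    # Term ranking step (stand-in): rank non-query terms by centroid weight,
    # breaking ties alphabetically so the result is deterministic.
    candidates = [t for t in centroid if t not in q_vec]
    candidates.sort(key=lambda t: (-centroid[t], t))
    selected = candidates[:n_terms]

    # Re-weighting step: Rocchio with positive feedback only,
    # w(t) = alpha * w_query(t) + beta * w_centroid(t).
    return {
        t: alpha * q_vec.get(t, 0) + beta * centroid.get(t, 0.0)
        for t in list(q_vec) + selected
    }


if __name__ == "__main__":
    docs = [
        ["aspirin", "stroke", "prevention"],
        ["aspirin", "stroke", "risk"],
        ["stroke", "therapy"],
        ["unrelated"],
    ]
    print(rocchio_expand(["aspirin"], docs))
```

The sketch keeps term ranking (which candidates are selected) separate from re-weighting (what weight each selected term receives), since the abstract reports that retrieval effectiveness depends on how the two are paired.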
Full text: 1 Index: WPRIM Main subject: Models, Statistical / Information Storage and Retrieval Type of study: Risk_factors_studies Language: En Journal: Healthcare Informatics Research Year: 2011 Type: Article