Cluster-based information retrieval modeling

Bibliographic Details
Main Author: Sze, Richard
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/15435
Description
Summary: Cluster-based information retrieval (IR), an extension of conventional information retrieval strategy, is based on the assumption that a document collection can be organized into a set of topics in a way that improves retrieval effectiveness. The cluster-based IR model assumes that queries can be associated with clusters that contain high concentrations of relevant documents, and that such association can lead to gains in retrieval effectiveness. Earlier studies, however, have reported negative to mixed results for the performance of the model. Moreover, studies that investigate the potential of the model when queries are manually associated with the appropriate clusters are lacking. The goal of this thesis is to provide evidence on the effectiveness of the cluster-based IR model through extensive empirical studies that explore alternative schemes of the model on a large scale and against a well-accepted benchmark. The investigation shows that the cluster-based IR model has the potential to enhance retrieval effectiveness, yet the alternative techniques examined fail to achieve that enhancement in practice.