Summary: | Master's === Yuan Ze University === Department of Computer Science and Engineering === 99 === Nowadays, Web search engines play an important role in helping people effectively find
information from massive Web data. The Web query classification (WQC) problem is a
crucial issue in search engine technology. The task of WQC is to classify Web queries
into relevant Web categories. For the WQC problem, there are two major difficulties.
First, most queries are short and ambiguous. Second, many queries have more than one
user intention. Therefore, this research proposes a scheme that exploits multiple search
engines to enrich user queries, and then extracts multiple latent topics from the expanded
queries. The scheme uses the Latent Dirichlet Allocation (LDA) model to extract the latent
topics from the enriched queries for query classification. The experiments show that our
approach improves precision and F1 by 6.5% and 6.6%, respectively,
in comparison with the schemes proposed by Shen et al. in 2005. These
results indicate that the proposed LDA-based scheme can effectively improve WQC performance.
|
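The scheme summarized above (enrich each query with results from several search engines, then extract latent topics with LDA) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `enrich`, `lda_gibbs`, and `top_topics`, the engine labels, the hyperparameters `alpha` and `beta`, and the threshold are all assumptions chosen for illustration, and the LDA inference shown is a plain collapsed Gibbs sampler, which may differ from the inference method actually used. The step that maps latent topics to Web categories is not described in the abstract and is left out.

```python
import random
from collections import Counter


def enrich(query, snippets_by_engine):
    """Expand a short query with snippet text from several engines.

    `snippets_by_engine` is a hypothetical dict: engine label -> list of snippets.
    """
    tokens = query.lower().split()
    for snippets in snippets_by_engine.values():
        for snippet in snippets:
            tokens.extend(snippet.lower().split())
    return tokens


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (theta, topic_word): per-document topic distributions and
    per-topic word counts. alpha and beta are assumed smoothing values.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(num_topics)])
    return theta, topic_word


def top_topics(theta_d, threshold=0.25):
    """Topics whose weight reaches the threshold, strongest first.

    Keeping more than one topic reflects that a query may carry several intents.
    """
    order = sorted(range(len(theta_d)), key=lambda k: -theta_d[k])
    return [k for k in order if theta_d[k] >= threshold]
```

A possible usage, with made-up snippets standing in for real search results: build one enriched document per query with `enrich`, pass the list of documents to `lda_gibbs`, and call `top_topics` on each row of the returned `theta` to obtain the candidate intents for that query.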