Summary: | Master's === National Taiwan Normal University === Department of Computer Science and Information Engineering === 102 === The goal of this thesis is to automatically suggest query keywords drawn from the results returned by a search engine, so that these keywords can be used as specialized queries to further filter a large set of search results. A two-level query suggestion method, called M_PhRank, is proposed. The first level provides query terms that cover as many search results as possible; the second level provides query terms that have a clear meaning and low overlap among the results they cover. First, the coverage of each word over the search results is computed as its novelty score, which is used to select the topic terms for the first level. Second, the semantic scores of words are estimated by running a random walk algorithm on the word co-occurrence graph. Query keywords consisting of 2-3 non-topic words form the candidate subtopic terms, and the semantic score of each candidate is computed from the semantic scores of its component words. Given the requested number of suggestions, the number of subtopic terms under each topic term is set in proportion to the coverage of that topic term. Finally, the hierarchical query suggestion structure is built from the topic terms on the first level and their corresponding subtopic terms on the second level. Experimental results show that M_PhRank outperforms the baseline method: it provides more semantically specific terms and achieves high coverage with only a limited increase in overlap. Moreover, according to a user survey, the hierarchy of query keyword suggestions constructed by M_PhRank achieves high satisfaction for query assistance.
|
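The scoring and allocation steps described in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the exact formulas, so the PageRank-style random walk, the damping factor of 0.85, the mean used to combine word scores into a phrase score, and the largest-remainder rounding are all assumptions, and every function name here is hypothetical.

```python
from itertools import combinations


def cooccurrence_graph(docs):
    """Undirected weighted graph; an edge weight counts the documents
    (search results) in which both words appear together."""
    graph = {}
    for doc in docs:
        for u, v in combinations(sorted(set(doc)), 2):
            graph.setdefault(u, {})
            graph.setdefault(v, {})
            graph[u][v] = graph[u].get(v, 0.0) + 1.0
            graph[v][u] = graph[v].get(u, 0.0) + 1.0
    return graph


def random_walk_scores(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration (assumed form of the random walk)."""
    nodes = list(graph)
    if not nodes:
        return {}
    n = len(nodes)
    scores = {w: 1.0 / n for w in nodes}
    out_weight = {w: sum(graph[w].values()) for w in nodes}
    for _ in range(iterations):
        updated = {}
        for w in nodes:
            # Each neighbour u passes on its score in proportion to edge weight.
            incoming = sum(scores[u] * graph[u][w] / out_weight[u]
                           for u in graph[w])
            updated[w] = (1.0 - damping) / n + damping * incoming
        scores = updated
    return scores


def phrase_score(phrase, word_scores):
    """Assumption: a candidate subtopic term scores the mean of its words."""
    return sum(word_scores.get(w, 0.0) for w in phrase) / len(phrase)


def allocate_slots(coverage, total):
    """Split `total` subtopic slots across topic terms in proportion to
    their coverage, using largest-remainder rounding (an assumption)."""
    whole = sum(coverage.values())
    exact = {t: total * c / whole for t, c in coverage.items()}
    slots = {t: int(x) for t, x in exact.items()}
    leftover = total - sum(slots.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - slots[t],
                          reverse=True)
    for t in by_remainder[:leftover]:
        slots[t] += 1
    return slots


# Toy search results, each reduced to a bag of words.
docs = [["apple", "pie", "recipe"], ["apple", "pie"], ["apple", "phone"]]
word_scores = random_walk_scores(cooccurrence_graph(docs))
top_word = max(word_scores, key=word_scores.get)
slots = allocate_slots({"a": 6, "b": 3, "c": 1}, 10)
```

In this toy run, `top_word` is the word with the strongest co-occurrence links, and `slots` gives each topic term a share of the ten suggestions that matches its coverage.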