Summary: | 碩士 === 國立臺灣大學 === 資訊工程學研究所 === 102 === Previous work on snippet generation focused mainly on how to produce one snippet for an individual search result. This paper aims to generate snippets as a comprehensive overview for an entity query (e.g., flu) in a search-result page.
Our approach first extracts the attributes (e.g., symptom and diagnose) of the categories (e.g., disease) from multi-resources including a community-based question-answering (CQA) website, an online encyclopedia website and suggestions from a commercial search engine. Then, we generate the snippets based on how central a sentence is to the query, its category, and how well it diversifies the attributes from multi-resources. Integer Linear Programming (ILP) is adopted to find the optimal sentence set. After finding the initial set of sentences, we further improve the result by aggregate the search-result page(SERP) of the query''s suggestion words.
The experiments are conducted on Wikipedia, Yahoo! Answers, Google Search. Experimental results demonstrate the effectiveness of our approach, compared to an existing commercial search engine and several summarization baselines.
|