Aggregating Multi-Resources to Improve the Diversity of Search Engine Result Pages

碩士 === 國立臺灣大學 === 資訊工程學研究所 === 102 === Previous work on snippet generation focused mainly on how to produce one snippet for an individual search result. This paper aims to generate snippets as a comprehensive overview for an entity query (e.g., flu) in a search-result page. Our approach first extra...

Full description

Bibliographic Details
Main Authors: Chung-Lun Chiang, 江忠倫
Other Authors: 鄭卜壬
Format: Others
Language:en_US
Published: 2014
Online Access:http://ndltd.ncl.edu.tw/handle/35791180022806804451
Description
Summary:碩士 === 國立臺灣大學 === 資訊工程學研究所 === 102 === Previous work on snippet generation focused mainly on how to produce one snippet for an individual search result. This paper aims to generate snippets as a comprehensive overview for an entity query (e.g., flu) in a search-result page. Our approach first extracts the attributes (e.g., symptom and diagnose) of the categories (e.g., disease) from multi-resources including a community-based question-answering (CQA) website, an online encyclopedia website and suggestions from a commercial search engine. Then, we generate the snippets based on how central a sentence is to the query, its category, and how well it diversifies the attributes from multi-resources. Integer Linear Programming (ILP) is adopted to find the optimal sentence set. After finding the initial set of sentences, we further improve the result by aggregate the search-result page(SERP) of the query''s suggestion words. The experiments are conducted on Wikipedia, Yahoo! Answers, Google Search. Experimental results demonstrate the effectiveness of our approach, compared to an existing commercial search engine and several summarization baselines.