Summary: | 碩士 === 國立清華大學 === 資訊工程學系 === 89 === Most summarization systems are designed for a single document at present. These systems indicate the essence of individual document, but do not transfer similar documents into single summary. Can we develop a multi-document summarization system, which transfers related documents with the same event into a summary? If that is possible, the main points of documents will be clearly and simply displayed with two or three sentences. Users can see whether these documents are what they want in a minute. It can reduce time for collecting documents and enable users to gather information on the Internet more efficiently.
To develop a multi-document summarization system is the goal of this thesis. Summary produced by the must system satisfy two conditions: indicative and topic related. The summary should be tailored to suit user’s query.
To achieve this goal, we will study the indicativeness and topic relevance of sentences, and the selection of sentences that are important and independence to each other. Finally, unimportant small clauses will be deleted, to make the final summary more concise.
System generates summaries with 248 documents and fifty topics of NTCIR. The reduction rate is over 95%. overall, the quality of summaries produced were satisfactory.
|