Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example
Master's === National Chengchi University === Department of Management Information Systems === 105 === The rapid development of information technology over the past decades has dramatically increased the amount of online information. Because absorbing large amounts of information is time-consuming for users, we present a method in this thesis by...
Main Author: | 林孟儀 |
---|---|
Other Authors: | 楊建民 |
Format: | Others |
Language: | zh-TW |
Online Access: | http://ndltd.ncl.edu.tw/handle/35qc3k |
id |
ndltd-TW-105NCCU5396016 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105NCCU5396016 2019-05-15T23:25:04Z http://ndltd.ncl.edu.tw/handle/35qc3k Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example 應用文本主題與關係探勘於多文件自動摘要方法之研究:以電影評論文章為例 林孟儀 Master's National Chengchi University Department of Management Information Systems 105 楊建民 thesis 57 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chengchi University === Department of Management Information Systems === 105 === The rapid development of information technology over the past decades has dramatically increased the amount of online information. Because absorbing large amounts of information is time-consuming for users, this thesis presents a multi-document summarization method based on text topic and relationship mining. The method helps users grasp the theme of multiple documents quickly by reading an accurate summary instead of the full documents.
We use movie reviews as an example of multi-document summarization and apply the concept of article structure to categorize summary content into film data, film orientation, and conclusion by matching against a movie-review thesaurus built for this thesis. We then cluster the paragraphs in the film orientation structure into topics with Latent Dirichlet Allocation (LDA). Next, we apply the concept of a text relationship map, a network in which each node is a paragraph and each edge indicates that the two corresponding paragraphs are related, to extract the most important paragraph in each topic and order the extracted paragraphs. Finally, we remove conjunctions, replace pronouns with the names they refer to in each extracted paragraph, and generate a bullet-point summary.
The results show that the summaries produced by this method cover different topics and are more diverse. The similarity between the produced summaries and the best-sample summaries increases by 10.8228%, 14.0123% and 25.8142%, respectively. The method captures the key content effectively and generates a comprehensive summary. With this method, users can aggregate movie reviews automatically and obtain a simplified summary, which reduces the time spent searching for and reading articles.
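The paragraph-selection step in the abstract (a text relationship map with paragraphs as nodes and relatedness as edges) can be illustrated with a minimal sketch. The abstract does not say how relatedness or importance is measured, so everything below is an assumption made for illustration: cosine similarity over word counts, a hypothetical `threshold` parameter, "most important" taken as the paragraph with the most links inside its topic, and ordering by original position. The `topic_of` labels stand in for the output of the LDA step, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real word segmenter."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_relationship_map(paragraphs, threshold=0.2):
    """Adjacency sets: paragraphs i and j are linked when similarity >= threshold.

    The similarity measure and threshold are assumptions, not taken from the thesis.
    """
    vectors = [Counter(tokenize(p)) for p in paragraphs]
    edges = {i: set() for i in range(len(paragraphs))}
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def most_connected(edges, members):
    """Member with the most links to other members; ties go to the earliest index."""
    member_set = set(members)
    return max(members, key=lambda i: (len(edges[i] & member_set), -i))


def select_per_topic(paragraphs, topic_of, threshold=0.2):
    """Pick one paragraph index per topic, returned in original document order.

    topic_of[i] is the topic label of paragraph i (for example, from an LDA step).
    """
    edges = build_relationship_map(paragraphs, threshold)
    topics = {}
    for index, topic in enumerate(topic_of):
        topics.setdefault(topic, []).append(index)
    return sorted(most_connected(edges, members) for members in topics.values())
```

Calling `select_per_topic(paragraphs, topic_of)` returns one paragraph index per topic. Counting links only within a topic keeps a paragraph from being selected merely because it resembles paragraphs from other topics; other centrality measures would fit the same structure.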
|
author2 |
楊建民 |
author_facet |
楊建民 林孟儀 |
author |
林孟儀 |
spellingShingle |
林孟儀 Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example |
author_sort |
林孟儀 |
title |
Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example |
url |
http://ndltd.ncl.edu.tw/handle/35qc3k |
work_keys_str_mv |
AT línmèngyí applicationoftexttopicandrelationshipminingformultidocumentsummarizationusingmoviereviewsasanexample AT línmèngyí yīngyòngwénběnzhǔtíyǔguānxìtànkānyúduōwénjiànzìdòngzhāiyàofāngfǎzhīyánjiūyǐdiànyǐngpínglùnwénzhāngwèilì |
_version_ |
1719148159779733504 |