Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example

Master's === National Chengchi University === Department of Management Information Systems === 105 === The rapid development of information technology over the past decades has dramatically increased the amount of online information. Because absorbing large amounts of information is time-consuming for users, we present a method in this thesis by...

Bibliographic Details
Main Author: 林孟儀
Other Authors: 楊建民
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/35qc3k
id ndltd-TW-105NCCU5396016
record_format oai_dc
spelling ndltd-TW-105NCCU5396016 2019-05-15T23:25:04Z http://ndltd.ncl.edu.tw/handle/35qc3k Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example 應用文本主題與關係探勘於多文件自動摘要方法之研究:以電影評論文章為例 林孟儀 Master's National Chengchi University Department of Management Information Systems 105 楊建民 degree thesis ; thesis 57 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chengchi University === Department of Management Information Systems === 105 === The rapid development of information technology over the past decades has dramatically increased the amount of online information. Because absorbing large amounts of information is time-consuming for users, this thesis presents a multi-document summarization method based on text topic and relationship mining, so that users can grasp the theme of multiple documents quickly by reading an accurate summary instead of the whole documents. We use movie reviews as an example and apply the concept of article structure to categorize summary content into film data, film orientation and conclusion by comparison with a movie-review thesaurus built in this thesis. We then cluster the paragraphs in the film orientation structure into topics with Latent Dirichlet Allocation (LDA). Next, we apply the concept of a text relationship map, a network in which each node is a paragraph and each edge indicates that the two corresponding paragraphs are related, to extract the most important paragraph in each topic and order the extracted paragraphs. Finally, we remove conjunctions and replace pronouns with the names they refer to in each extracted paragraph, and generate a bullet-point summary. The results show that the produced summary covers different topics and improves the diversity of the summary. The similarity between the produced summaries and the best-sample summaries rises by 10.8228%, 14.0123% and 25.8142%, respectively. The method presented in this thesis grasps the key content effectively and generates a comprehensive summary. With this method, users can aggregate movie reviews automatically and obtain a simplified summary, reducing the time spent searching for and reading articles.
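The text relationship map step in the description can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not say how relatedness or importance is measured, so this sketch assumes bag-of-words cosine similarity with a hypothetical threshold of 0.3 for edges, and treats the paragraph with the most edges as the most important one. The function names are made up for illustration, and the tokenizer handles only English letters; the Chinese reviews used in the thesis would need a word segmenter instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_relationship_map(paragraphs, threshold=0.3):
    """Return an adjacency dict over paragraph indices.

    Two paragraphs are linked when their cosine similarity reaches the
    threshold (an assumed relatedness criterion, not the thesis's).
    """
    vectors = [Counter(tokenize(p)) for p in paragraphs]
    edges = {i: set() for i in range(len(paragraphs))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def most_important(paragraphs, threshold=0.3):
    """Index of the paragraph with the most edges (assumed importance
    measure); ties go to the earliest paragraph."""
    edges = build_relationship_map(paragraphs, threshold)
    return max(range(len(paragraphs)), key=lambda i: (len(edges[i]), -i))
```

In a pipeline like the one described, `most_important` would be called once per LDA topic, on the paragraphs assigned to that topic.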
author2 楊建民
author_facet 楊建民
林孟儀
author 林孟儀
author_sort 林孟儀
title Application of text topic and relationship mining for multi-document summarization: using movie reviews as an example
url http://ndltd.ncl.edu.tw/handle/35qc3k