On the abstraction and presentation of multi-source knowledge
Master's === Chang Jung Christian University === Master's Program, Department of Information Management === 97 === With the spread of information technology and the prevalence of the Internet, information is more accessible than in years past. Selecting suitable content from the massive amount of available material is becoming crucial, so the technique of document abstraction has...
Main Authors: | CHAN YUEH CHIN, 詹岳縉 |
---|---|
Other Authors: | 王献章 |
Format: | Others |
Language: | zh-TW |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/09151249521505393950 |
id |
ndltd-TW-097CJU00396001 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-097CJU00396001 2015-10-13T13:11:49Z http://ndltd.ncl.edu.tw/handle/09151249521505393950 On the abstraction and presentation of multi-source knowledge 網路多文件摘要整合及呈現 CHAN YUEH CHIN 詹岳縉 Master's Chang Jung Christian University Master's Program, Department of Information Management 97 With the spread of information technology and the prevalence of the Internet, information is more accessible than in years past. Selecting suitable content from the massive amount of available material is becoming crucial, so document abstraction has become important because it can extract usable information from large volumes of data. This thesis proposes a process for abstracting specific websites on the Internet and presenting the result to users. The proposed system consists of three blocks. First, it collects the original corpus from the distinctive content of relevant websites. Second, it processes the corpus with computational-linguistics methods, including word segmentation and tagging. Third, it measures the similarity between paragraphs and sentences to group topic paragraphs with similar expressions into categories. Finally, it extracts the keywords within each category, and the sentences that contain the most keywords become the abstract. The results show that the abstraction system achieves a satisfaction score of 89-90% when evaluated for readability, fluency, comprehensiveness, and non-redundancy of the extracted information. This indicates that the abstraction system is acceptable to users. Keywords: web abstraction, multi-document abstraction, similarity measurement. 王献章 2009 Degree thesis ; thesis 54 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Chang Jung Christian University === Master's Program, Department of Information Management === 97 === With the spread of information technology and the prevalence of the Internet, information is more accessible than in years past. Selecting suitable content from the massive amount of available material is becoming crucial, so document abstraction has become important because it can extract usable information from large volumes of data.
This thesis proposes a process for abstracting specific websites on the Internet and presenting the result to users. The proposed system consists of three blocks. First, it collects the original corpus from the distinctive content of relevant websites. Second, it processes the corpus with computational-linguistics methods, including word segmentation and tagging. Third, it measures the similarity between paragraphs and sentences to group topic paragraphs with similar expressions into categories. Finally, it extracts the keywords within each category, and the sentences that contain the most keywords become the abstract.
The results show that the abstraction system achieves a satisfaction score of 89-90% when evaluated for readability, fluency, comprehensiveness, and non-redundancy of the extracted information. This indicates that the abstraction system is acceptable to users.
Keywords: web abstraction, multi-document abstraction, similarity measurement.
|
author2 |
王献章 |
author_facet |
王献章 CHAN YUEH CHIN 詹岳縉 |
author |
CHAN YUEH CHIN 詹岳縉 |
spellingShingle |
CHAN YUEH CHIN 詹岳縉 On the abstraction and presentation of multi-source knowledge |
author_sort |
CHAN YUEH CHIN |
title |
On the abstraction and presentation of multi-source knowledge |
title_short |
On the abstraction and presentation of multi-source knowledge |
title_full |
On the abstraction and presentation of multi-source knowledge |
title_fullStr |
On the abstraction and presentation of multi-source knowledge |
title_full_unstemmed |
On the abstraction and presentation of multi-source knowledge |
title_sort |
on the abstraction and presentation of multi-source knowledge |
publishDate |
2009 |
url |
http://ndltd.ncl.edu.tw/handle/09151249521505393950 |
work_keys_str_mv |
AT chanyuehchin ontheabstractionandpresentationofmultisourceknowledge AT zhānyuèjìn ontheabstractionandpresentationofmultisourceknowledge AT chanyuehchin wǎnglùduōwénjiànzhāiyàozhěnghéjíchéngxiàn AT zhānyuèjìn wǎnglùduōwénjiànzhāiyàozhěnghéjíchéngxiàn |
_version_ |
1717733158051381248 |
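
The pipeline outlined in the description (group similar sentences, extract keywords per category, keep the sentence with the most keywords) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the record does not say which similarity measure, threshold, or keyword criterion was used, so cosine similarity over word counts, a greedy grouping rule, the `threshold` and `top_k` values, the stopword list, and every function name here are hypothetical. A regular-expression tokenizer for English stands in for the Chinese word segmentation and tagging step.

```python
import math
import re
from collections import Counter

# Assumed, illustrative stopword list (not from the thesis).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "on", "for"}


def tokenize(sentence):
    """Stand-in for word segmentation: lowercase words minus stopwords."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_sentences(sentences, threshold=0.3):
    """Greedy grouping: a sentence joins the first category whose
    first (seed) sentence is at least `threshold` similar to it."""
    categories = []  # each category is a list of (sentence, bag) pairs
    for s in sentences:
        bag = Counter(tokenize(s))
        for cat in categories:
            if cosine(bag, cat[0][1]) >= threshold:
                cat.append((s, bag))
                break
        else:
            categories.append([(s, bag)])
    return categories


def summarize(sentences, threshold=0.3, top_k=3):
    """Return one sentence per category: the one covering the most
    of the category's `top_k` most frequent words (its keywords)."""
    summary = []
    for cat in group_sentences(sentences, threshold):
        totals = Counter()
        for _, bag in cat:
            totals.update(bag)
        keywords = {w for w, _ in totals.most_common(top_k)}
        best = max(cat, key=lambda pair: len(keywords & set(pair[1])))
        summary.append(best[0])
    return summary


if __name__ == "__main__":
    docs = [
        "The typhoon closed schools in Tainan on Monday.",
        "Schools in Tainan stayed closed after the typhoon.",
        "The stock market rose sharply on Friday.",
    ]
    for line in summarize(docs):
        print(line)
```

Raw term frequency is the simplest possible keyword criterion; a weighting such as TF-IDF would be a natural substitute, but the record does not indicate which the thesis used.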