Integration of Ontology and Semantic Similarity for extracting Keywords from Documents

Master's === Chung Yuan Christian University === Graduate Institute of Information and Computer Engineering === 102 === A document may contain a large number of words, but only a few keywords describe its content. These keywords can also be used to distinguish the type of the document. Obtaining these keywords requires a sequence of extraction methods to get...

Bibliographic Details
Main Authors:	Yuan-Lin Chen, 陳宛琳
Other Authors:	Chung-Shyan Liu
Format:	Others
Language:	zh-TW
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/46942011593856699312

id	ndltd-TW-102CYCU5392030
record_format	oai_dc
spelling	ndltd-TW-102CYCU5392030 2015-10-13T23:49:49Z http://ndltd.ncl.edu.tw/handle/46942011593856699312 Integration of Ontology and Semantic Similarity for extracting Keywords from Documents 結合本體論與語意相似程度對文件萃取關鍵字 Yuan-Lin Chen 陳宛琳 Master's, Chung Yuan Christian University, Graduate Institute of Information and Computer Engineering, 102 A document may contain a large number of words, but only a few keywords describe its content. These keywords can also be used to distinguish the type of the document, and obtaining them requires a sequence of extraction steps. This thesis presents an approach to extracting keywords from documents by combining knowledge in an Ontology with semantic similarity. We can find all the knowledge that the Ontology describes about a word, and then select the most suitable knowledge by calculating semantic similarity. With this combination, we can find keywords in documents. First, we use Lucene, a tool for full-text search, to get words from the content of the document and to remove stop words. A two-stage stemming method is used to reduce words to their root forms. The words are tagged using a POS tagger. The meanings of the words are obtained by searching the Ontology, and the similarity between them is computed using Lin's semantic similarity. Finally, a subset of keywords is selected using the domain Ontology information. Chung-Shyan Liu 留忠賢 2014 學位論文 ; thesis 86 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === Chung Yuan Christian University === Graduate Institute of Information and Computer Engineering === 102 === A document may contain a large number of words, but only a few keywords describe its content. These keywords can also be used to distinguish the type of the document, and obtaining them requires a sequence of extraction steps. This thesis presents an approach to extracting keywords from documents by combining knowledge in an Ontology with semantic similarity. We can find all the knowledge that the Ontology describes about a word, and then select the most suitable knowledge by calculating semantic similarity. With this combination, we can find keywords in documents. First, we use Lucene, a tool for full-text search, to get words from the content of the document and to remove stop words. A two-stage stemming method is used to reduce words to their root forms. The words are tagged using a POS tagger. The meanings of the words are obtained by searching the Ontology, and the similarity between them is computed using Lin's semantic similarity. Finally, a subset of keywords is selected using the domain Ontology information.
author2	Chung-Shyan Liu
author_facet	Chung-Shyan Liu Yuan-Lin Chen 陳宛琳
author	Yuan-Lin Chen 陳宛琳
spellingShingle	Yuan-Lin Chen 陳宛琳 Integration of Ontology and Semantic Similarity for extracting Keywords from Documents
author_sort	Yuan-Lin Chen
title	Integration of Ontology and Semantic Similarity for extracting Keywords from Documents
title_short	Integration of Ontology and Semantic Similarity for extracting Keywords from Documents
title_full	Integration of Ontology and Semantic Similarity for extracting Keywords from Documents
title_fullStr	Integration of Ontology and Semantic Similarity for extracting Keywords from Documents
title_full_unstemmed	Integration of Ontology and Semantic Similarity for extracting Keywords from Documents
title_sort	integration of ontology and semantic similarity for extracting keywords from documents
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/46942011593856699312
work_keys_str_mv	AT yuanlinchen integrationofontologyandsemanticsimilarityforextractingkeywordsfromdocuments AT chénwǎnlín integrationofontologyandsemanticsimilarityforextractingkeywordsfromdocuments AT yuanlinchen jiéhéběntǐlùnyǔyǔyìxiāngshìchéngdùduìwénjiàncuìqǔguānjiànzì AT chénwǎnlín jiéhéběntǐlùnyǔyǔyìxiāngshìchéngdùduìwénjiàncuìqǔguānjiànzì
_version_	1718086931849412608
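
The description names Lin's semantic similarity as the measure used to compare word meanings. As a rough illustration only, the sketch below computes the standard Lin measure, 2 * IC(lcs) / (IC(a) + IC(b)) with IC(c) = -log p(c), over a tiny taxonomy. The taxonomy, the counts, and every function name here are hypothetical and are not taken from the thesis, whose actual ontology and implementation are not described in this record.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from the thesis.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical corpus counts of each concept's own occurrences.
# Every subtree must have a non-zero total, or log(0) would be undefined.
COUNTS = {"entity": 0, "animal": 2, "vehicle": 1, "dog": 5, "cat": 4, "car": 8}


def ancestors(concept):
    """Return the concept followed by its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def subtree_count(concept):
    """Occurrences of the concept plus all of its descendants."""
    return sum(n for c, n in COUNTS.items() if concept in ancestors(c))


TOTAL = subtree_count("entity")


def information_content(concept):
    """IC(c) = -log p(c), with p(c) the subtree's relative frequency."""
    return -math.log(subtree_count(concept) / TOTAL)


def lin_similarity(a, b):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b))."""
    if a == b:
        return 1.0
    ancestors_of_b = set(ancestors(b))
    # Lowest common subsumer: the first ancestor of a that also subsumes b.
    lcs = next(c for c in ancestors(a) if c in ancestors_of_b)
    denom = information_content(a) + information_content(b)
    return 2 * information_content(lcs) / denom if denom else 0.0


if __name__ == "__main__":
    print(lin_similarity("dog", "cat"))  # share the subsumer "animal"
    print(lin_similarity("dog", "car"))  # share only the root
```

Two concepts that meet only at the root score zero because the root carries no information content, while siblings under a more specific parent score higher. A keyword extractor could use such scores to prefer the word sense that agrees best with the other words in the document.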

Similar Items