Conceptual Text Mining with Hierarchical Knowledge Structures
Ph.D. === National Cheng Kung University === Institute of Information Management === 101 === Text mining is a critical technique for managing huge collections of documents. However, most existing text mining algorithms are easily affected by ambiguous terms. A classifier's ability to disambiguate is thus as important as its ability to classify accura...
Main Authors: | Fu-Ching Tsai, 蔡馥璟 |
---|---|
Other Authors: | Sheng-Tun Li |
Format: | Others |
Language: | en_US |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/64633430782280228010 |
id |
ndltd-TW-101NCKU5396001 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101NCKU5396001 2015-10-13T22:01:27Z http://ndltd.ncl.edu.tw/handle/64633430782280228010 Conceptual Text Mining with Hierarchical Knowledge Structures 結合階層式知識結構之文本分析 Fu-Ching Tsai 蔡馥璟 Ph.D. National Cheng Kung University Institute of Information Management 101 Sheng-Tun Li 李昇暾 2013 dissertation ; thesis 55 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Ph.D. === National Cheng Kung University === Institute of Information Management === 101 === Text mining is a critical technique for managing huge collections of documents. However, most existing text mining algorithms are easily affected by ambiguous terms. A classifier's ability to disambiguate is thus as important as its ability to classify accurately. Knowledge structure (KS) has proven effective in discovering the hidden structural relations and implications of knowledge, so that significant reasoning patterns can be retrieved to make text analysis more efficient. In this research, we propose a conceptual text mining framework based on two hierarchical KS models, lattice and tree, to investigate how effectively a hierarchical KS retrieves context from a corpus in text mining tasks.
The first model uses fuzzy formal concept analysis to conceptualize documents into a more abstract form of concepts, and uses these concepts as training examples to alleviate the arbitrary outcomes caused by ambiguous terms. The proposed model is evaluated on a benchmark testbed and two opinion polarity datasets. The experimental results indicate superior performance on all datasets. Applying concept analysis to opinion polarity classification is a pioneering effort in the disambiguation of Web 2.0 content, and the approach presented in this paper offers significant improvements over current methods. The results show that the proposed model is less sensitive to noise and adapts well to cross-domain applications. However, the lattice-based model suffers from high computational complexity, which limits its ability to handle big data. To address this critical issue, we propose a new approach that constructs a tree-based KS from a corpus; the tree reveals the significant relations among knowledge objects and provides concise entity relations that avoid computational overload. The effectiveness of the second model is demonstrated on two representative public datasets. The evaluation results show that the method presented in this work achieves remarkable consistency with the domain-specific knowledge structure, and that it reflects appropriate similarities among knowledge objects, along with hierarchical implications, in the document classification task.
|
author2 |
Sheng-Tun Li |
author_facet |
Sheng-Tun Li Fu-Ching Tsai 蔡馥璟 |
author |
Fu-Ching Tsai 蔡馥璟 |
author_sort |
Fu-Ching Tsai |
title |
Conceptual Text Mining with Hierarchical Knowledge Structures |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/64633430782280228010 |