Apply Document Clustering to Improve Formal Concept Analysis Performance

Master's thesis === National Yunlin University of Science and Technology (國立雲林科技大學) === Master's Program, Department of Information Management === 100 === Because of information overload, constructing an effective filtering mechanism is one of the most important issues for information retrieval systems. Formal concept analysis (FCA) is used to construct domain ontologies. It can also achieve the purpose of classification...


Bibliographic Details
Main Authors: Kuan-yu Chen, 陳冠余
Other Authors: Chuen-min Huang
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/15480580512085985812
id ndltd-TW-100YUNT5396059
record_format oai_dc
spelling ndltd-TW-100YUNT5396059 2015-10-13T21:55:45Z http://ndltd.ncl.edu.tw/handle/15480580512085985812 Apply Document Clustering to Improve Formal Concept Analysis Performance 結合文件分群技術改善正規化概念分析效能 Kuan-yu Chen 陳冠余 碩士 國立雲林科技大學 資訊管理系碩士班 100 Chuen-min Huang 黃純敏 2012 學位論文 ; thesis 35 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Yunlin University of Science and Technology (國立雲林科技大學) === Master's Program, Department of Information Management === 100 === Because of information overload, constructing an effective filtering mechanism is one of the most important issues for information retrieval systems. Formal concept analysis (FCA) is used to construct domain ontologies, and its classification capability can also be used to categorize search results. However, when FCA processes large, broad datasets, vocabulary confusion and an excessive number of attributes can make it inefficient: the resulting concept lattice becomes enormous, and the system spends a large amount of time traversing it. Previous studies have therefore usually applied FCA to smaller datasets. To reduce this inefficiency, this study applies single-pass clustering to reduce the information dimensionality before running FCA. In our evaluation, single-pass clustering performed best with a threshold of 0.4. Compared with FCA without single-pass clustering, recall improved from 3% to 37% when 5% to 40% of the attributes were extracted. Moreover, only 10% of the attributes were needed to reach 70% recall. Search processing also became 4 to 15 times faster. In a user satisfaction survey, 77% of the users were satisfied with the proposed method.
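The single-pass clustering step mentioned in the abstract can be illustrated with a minimal sketch. The record does not state the similarity measure, term weighting, or tokenization used in the thesis, so the cosine similarity over raw term counts, the whitespace tokenizer, and the names `cosine` and `single_pass_cluster` below are all assumptions made for illustration; only the threshold of 0.4 comes from the abstract.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (dict-like)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass_cluster(docs, threshold=0.4):
    """Scan documents once; join the most similar cluster or open a new one.

    Assumption: similarity is cosine between the document's term counts and
    the summed term counts of a cluster (equivalent to its centroid, since
    cosine ignores scale). The thesis may use a different measure.
    """
    clusters = []  # each entry: {"sum": Counter of terms, "members": [doc index]}
    for idx, text in enumerate(docs):
        vec = Counter(text.lower().split())  # hypothetical tokenizer
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["sum"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(idx)
            best["sum"].update(vec)  # fold the document into the cluster
        else:
            clusters.append({"sum": Counter(vec), "members": [idx]})
    return [c["members"] for c in clusters]


docs = [
    "concept lattice analysis",
    "concept lattice traversal",
    "stock market prices",
    "stock market crash",
]
print(single_pass_cluster(docs, threshold=0.4))  # [[0, 1], [2, 3]]
```

Because each document is compared only against existing clusters and visited once, the result depends on document order; running FCA separately on each resulting cluster would then work with a smaller attribute set than the full collection.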
author2 Chuen-min Huang
author_facet Chuen-min Huang
Kuan-yu Chen
陳冠余
author Kuan-yu Chen
陳冠余
author_sort Kuan-yu Chen
title Apply Document Clustering to Improve Formal Concept Analysis Performance
publishDate 2012
url http://ndltd.ncl.edu.tw/handle/15480580512085985812