Implementation and Performance Evaluation of Association Rule Mining Algorithms

Master's === National Taiwan University of Science and Technology === Department of Information Management === 90 === Mining of association rules is a popular research area in data mining. Many association rule mining algorithms have been proposed in recent years. Each algorithm's author claims that it is the best under some specific conditions....


Bibliographic Details
Main Authors:	Shih-Chun Chiu, 邱士軍
Other Authors:	Yungho Leu
Format:	Others
Language:	zh-TW
Published:	2002
Online Access:	http://ndltd.ncl.edu.tw/handle/77869041005439613437

id	ndltd-TW-090NTUST396019
record_format	oai_dc
spelling	ndltd-TW-090NTUST396019 2015-10-13T14:41:23Z http://ndltd.ncl.edu.tw/handle/77869041005439613437 Implementation and Performance Evaluation of Association Rule Mining Algorithms 關聯規則演算法之實作和效能評估 Shih-Chun Chiu 邱士軍 Master's National Taiwan University of Science and Technology Department of Information Management 90 Yungho Leu 呂永和 2002 thesis 62 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Taiwan University of Science and Technology === Department of Information Management === 90 === Mining of association rules is a popular research area in data mining. Many association rule mining algorithms have been proposed in recent years, and each algorithm's author claims that it is the best under some specific conditions. A fair comparison serves as a guide for choosing the right mining algorithm for a given condition. Unfortunately, no impartial third party has conducted a comprehensive comparison of the association rule mining algorithms. In this thesis, we compare the performance of five well-known algorithms: Apriori, Boolean, FP-Growth, Maxminer and DIC. We implemented several versions of each algorithm and then chose the most efficient among our own implementations and the implementation provided directly by the original author, if one was available. We also describe the details of our implementations. To compare the performance of the algorithms, we use three synthetic transactional databases generated by the IBM dataset generator and the FoodMart database, a real transactional database from SQL Server. The three synthetic databases are T5I2, T10I4 and T20I6, which differ in mean transaction length and mean frequent itemset length. Experiments show that no algorithm prevails in all circumstances. The Apriori and DIC algorithms prevail when the minimum support is high and, therefore, less computation time is needed. On the other hand, the Boolean and FP-Growth algorithms scale well, in the sense that they prevail under low minimum support. Furthermore, the Boolean and FP-Growth algorithms significantly outperform the other algorithms when the mean transaction length is long. We also found that the memory occupied by the FP-tree is at least as large as the transactional database itself.
author2	Yungho Leu
author_facet	Yungho Leu Shih-Chun Chiu 邱士軍
author	Shih-Chun Chiu 邱士軍
spellingShingle	Shih-Chun Chiu 邱士軍 Implementation and Performance Evaluation of Association Rule Mining Algorithms
author_sort	Shih-Chun Chiu
title	Implementation and Performance Evaluation of Association Rule Mining Algorithms
title_short	Implementation and Performance Evaluation of Association Rule Mining Algorithms
title_full	Implementation and Performance Evaluation of Association Rule Mining Algorithms
title_fullStr	Implementation and Performance Evaluation of Association Rule Mining Algorithms
title_full_unstemmed	Implementation and Performance Evaluation of Association Rule Mining Algorithms
title_sort	implementation and performance evaluation of association rule mining algorithms
publishDate	2002
url	http://ndltd.ncl.edu.tw/handle/77869041005439613437
work_keys_str_mv	AT shihchunchiu implementataionandperformanceevaluationofassociationruleminingalgorithms AT qiūshìjūn implementataionandperformanceevaluationofassociationruleminingalgorithms AT shihchunchiu guānliánguīzéyǎnsuànfǎzhīshízuòhéxiàonéngpínggū AT qiūshìjūn guānliánguīzéyǎnsuànfǎzhīshízuòhéxiàonéngpínggū
_version_	1717756291331391488


Similar Items