Efficient Algorithms for the Discovery of Frequent Superset
Master's === National Chengchi University === Department of Computer Science === 93 === Algorithms for discovering frequent itemsets have been widely investigated. These frequent itemsets are subsets of the transactions in the database. In this thesis, we propose a novel mining task: mining frequent supersets from a database of itemsets, which is useful in bioinfor...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | en_US |
Published: | 2004 |
Online Access: | http://ndltd.ncl.edu.tw/handle/60384605572101970408 |
id |
ndltd-TW-093NCCU5394001 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-093NCCU5394001 2015-10-13T12:56:36Z http://ndltd.ncl.edu.tw/handle/60384605572101970408 Efficient Algorithms for the Discovery of Frequent Superset 高效率常見超集合探勘演算法之研究 Liao, Zhung-Xun 廖忠訓 Master's National Chengchi University Department of Computer Science 93 Algorithms for discovering frequent itemsets have been widely investigated. These frequent itemsets are subsets of the transactions in the database. In this thesis, we propose a novel mining task: mining frequent supersets from a database of itemsets, which is useful in bioinformatics, e-learning systems, job-shop scheduling, and so on. A frequent superset is an itemset for which the number of transactions contained in it is not less than the minimum support threshold. Intuitively, following the Apriori algorithm, level-wise discovery starts from 1-itemsets, then 2-itemsets, and so forth. However, these steps cannot use the Apriori property to reduce the search space, because even if an itemset is not frequent, its supersets may be frequent. To solve this problem, we propose three methods. The first is an Apriori-based approach, called Apriori-C. The second is an Eclat-based, depth-first approach, called Eclat-C. The last is the proposed data complement technique (DCT), which uses an original frequent itemset mining approach to discover frequent supersets. The experimental studies compare the performance of the three proposed methods by considering the effect of the number of transactions, the average transaction length, the number of distinct items, and the minimum support. The analysis shows that the proposed algorithms are time-efficient and scalable. Shan, Man-Kwan 沈錳坤 2004 thesis 71 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chengchi University === Department of Computer Science === 93 === Algorithms for discovering frequent itemsets have been widely investigated. These frequent itemsets are subsets of the transactions in the database. In this thesis, we propose a novel mining task: mining frequent supersets from a database of itemsets, which is useful in bioinformatics, e-learning systems, job-shop scheduling, and so on. A frequent superset is an itemset for which the number of transactions contained in it is not less than the minimum support threshold. Intuitively, following the Apriori algorithm, level-wise discovery starts from 1-itemsets, then 2-itemsets, and so forth. However, these steps cannot use the Apriori property to reduce the search space, because even if an itemset is not frequent, its supersets may be frequent. To solve this problem, we propose three methods. The first is an Apriori-based approach, called Apriori-C. The second is an Eclat-based, depth-first approach, called Eclat-C. The last is the proposed data complement technique (DCT), which uses an original frequent itemset mining approach to discover frequent supersets.
The experimental studies compare the performance of the three proposed methods by considering the effect of the number of transactions, the average transaction length, the number of distinct items, and the minimum support. The analysis shows that the proposed algorithms are time-efficient and scalable.
|
author2 |
Shan, Man-Kwan |
author_facet |
Shan, Man-Kwan Liao, Zhung-Xun 廖忠訓 |
author |
Liao, Zhung-Xun 廖忠訓 |
spellingShingle |
Liao, Zhung-Xun 廖忠訓 Efficient Algorithms for the Discovery of Frequent Superset |
author_sort |
Liao, Zhung-Xun |
title |
Efficient Algorithms for the Discovery of Frequent Superset |
title_short |
Efficient Algorithms for the Discovery of Frequent Superset |
title_full |
Efficient Algorithms for the Discovery of Frequent Superset |
title_fullStr |
Efficient Algorithms for the Discovery of Frequent Superset |
title_full_unstemmed |
Efficient Algorithms for the Discovery of Frequent Superset |
title_sort |
efficient algorithms for the discovery of frequent superset |
publishDate |
2004 |
url |
http://ndltd.ncl.edu.tw/handle/60384605572101970408 |
work_keys_str_mv |
AT liaozhungxun efficientalgorithmsforthediscoveryoffrequentsuperset AT liàozhōngxùn efficientalgorithmsforthediscoveryoffrequentsuperset AT liaozhungxun gāoxiàolǜchángjiànchāojíhétànkānyǎnsuànfǎzhīyánjiū AT liàozhōngxùn gāoxiàolǜchángjiànchāojíhétànkānyǎnsuànfǎzhīyánjiū |
_version_ |
1716869582419918848 |
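
The description in this record defines a frequent superset as an itemset that contains at least a minimum number of transactions, and mentions a data complement technique (DCT) that reuses ordinary frequent itemset mining. The Python sketch below is a minimal illustration of that idea and is not taken from the thesis. It assumes that DCT relies on the set identity "T is a subset of S exactly when the complement of S is a subset of the complement of T"; the thesis's actual procedure may differ. All function names, the item universe, and the toy transactions are hypothetical, and the brute-force enumeration stands in for a real miner such as Apriori or Eclat.

```python
from itertools import combinations


def all_itemsets(universe):
    """Yield every subset of `universe` as a frozenset (exponential; demo only)."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def superset_support(candidate, transactions):
    """Count the transactions that are contained in `candidate`."""
    return sum(1 for t in transactions if t <= candidate)


def frequent_supersets_direct(transactions, universe, min_support):
    """Brute force: check every candidate against the definition."""
    return {s for s in all_itemsets(universe)
            if superset_support(s, transactions) >= min_support}


def frequent_itemsets(transactions, universe, min_support):
    """Brute-force stand-in for any ordinary frequent itemset miner."""
    return {s for s in all_itemsets(universe)
            if sum(1 for t in transactions if s <= t) >= min_support}


def frequent_supersets_by_complement(transactions, universe, min_support):
    """Assumed complement idea: mine the complemented database, then
    complement each frequent itemset back to obtain a frequent superset."""
    complemented = [universe - t for t in transactions]
    mined = frequent_itemsets(complemented, universe, min_support)
    return {universe - s for s in mined}


# Hypothetical toy data: four transactions over the items a, b, c, d.
universe = frozenset("abcd")
transactions = [frozenset("a"), frozenset("ab"), frozenset("bc"), frozenset("c")]

# {a} contains only one transaction, but its superset {a, b} contains two,
# so an infrequent itemset can have a frequent superset.
print(superset_support(frozenset("a"), transactions))   # 1
print(superset_support(frozenset("ab"), transactions))  # 2

direct = frequent_supersets_direct(transactions, universe, 2)
via_complement = frequent_supersets_by_complement(transactions, universe, 2)
print(direct == via_complement)  # True
```

On this toy data both routes return the same collection, which is the reason an unmodified frequent itemset miner can be reused once the data are complemented. The support counts also show why Apriori-style pruning of infrequent small itemsets is unsafe for this task.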