Efficient Algorithms for the Discovery of Frequent Superset

Bibliographic Details
Main Authors: Liao, Zhung-Xun, 廖忠訓
Other Authors: Shan, Man-Kwan
Format: Others
Language: en_US
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/60384605572101970408
id ndltd-TW-093NCCU5394001
record_format oai_dc
spelling ndltd-TW-093NCCU5394001 2015-10-13T12:56:36Z http://ndltd.ncl.edu.tw/handle/60384605572101970408 Efficient Algorithms for the Discovery of Frequent Superset 高效率常見超集合探勘演算法之研究 Liao, Zhung-Xun 廖忠訓 Master's thesis, National Chengchi University, Department of Computer Science, 93 Shan, Man-Kwan 沈錳坤 2004 thesis 71 en_US
collection NDLTD
sources NDLTD
description Master's thesis === National Chengchi University === Department of Computer Science === 93 === Algorithms for discovering frequent itemsets have been investigated widely. Frequent itemsets are subsets of the transactions in a database. In this thesis, we propose a novel mining task: mining frequent supersets from a database of itemsets, a task that is useful in bioinformatics, e-learning systems, job-shop scheduling, and other domains. A superset is frequent if the number of transactions contained in it is not less than the minimum support threshold. Intuitively, following the Apriori algorithm, level-wise discovery would start from 1-itemsets, then 2-itemsets, and so forth. However, this order cannot use the Apriori property to reduce the search space, because an itemset that is not frequent may still have a frequent superset. To solve this problem, we propose three methods. The first is an Apriori-based approach, called Apriori-C. The second is an Eclat-based approach, called Eclat-C, which is depth-first. The third is the data complement technique (DCT), with which existing frequent itemset mining approaches can be used to discover frequent supersets. The experiments compare the performance of the three proposed methods under varying numbers of transactions, average transaction lengths, numbers of distinct items, and minimum support thresholds. The analysis shows that the proposed algorithms are time-efficient and scalable.
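
The abstract defines a frequent superset but does not spell out how the data complement technique works. The sketch below is one plausible reading, not the thesis's implementation: it relies only on the set identity that a candidate X contains a transaction T exactly when the complement of X is contained in the complement of T, so superset support in the original database equals ordinary itemset support in the complemented database. All function names, the toy transactions, and the brute-force enumeration are assumptions made for illustration; a real miner such as Apriori or Eclat would replace the exhaustive loop.

```python
from itertools import combinations


def superset_support(candidate, transactions):
    """Count the transactions that are contained in `candidate`."""
    return sum(1 for t in transactions if t <= candidate)


def subset_support(candidate, transactions):
    """Classic itemset support: count the transactions that contain `candidate`."""
    return sum(1 for t in transactions if candidate <= t)


def all_itemsets(universe):
    """Yield every subset of `universe` as a frozenset (exponential; toy data only)."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def frequent_supersets_bruteforce(transactions, universe, min_support):
    """Apply the definition directly to every candidate itemset."""
    return {
        c for c in all_itemsets(universe)
        if superset_support(c, transactions) >= min_support
    }


def frequent_supersets_by_complement(transactions, universe, min_support):
    """Assumed reading of the complement idea:
    1. complement every transaction with respect to the item universe,
    2. find frequent itemsets (ordinary subset support) in the complemented data,
    3. complement each result to obtain a frequent superset.
    """
    universe = frozenset(universe)
    complemented = [universe - t for t in transactions]
    return {
        universe - y for y in all_itemsets(universe)
        if subset_support(y, complemented) >= min_support
    }


# Hypothetical toy database over the items a, b, c, d.
universe = frozenset("abcd")
transactions = [frozenset("ab"), frozenset("ac"), frozenset("abc"), frozenset("d")]

direct = frequent_supersets_bruteforce(transactions, universe, min_support=2)
via_complement = frequent_supersets_by_complement(transactions, universe, min_support=2)

for itemset in sorted(direct, key=lambda s: (len(s), sorted(s))):
    print("".join(sorted(itemset)), superset_support(itemset, transactions))
```

In this toy database, {a, b} contains only one transaction, while its superset {a, b, c} contains three, which illustrates why the usual Apriori pruning direction does not apply. After complementing, the frequent sets are closed under taking subsets again, so a standard miner's pruning becomes usable.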