Mining Complete Linguistic Itemsets Based on Tree Structures


Bibliographic Details
Main Authors: Tsung-Ching Lin, 林宗慶
Other Authors: Tzung-Pei Hong
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/27694542080058570523
Description
Summary: Master's thesis === National University of Kaohsiung === Master's Program, Department of Computer Science and Information Engineering === 99 === Information technology (IT) has progressed rapidly in recent years, and the capacity to process and store data in databases has grown substantially. Extracting implicit information from large volumes of data to aid decision making has thus become a new challenge. Data mining is commonly used to discover useful information and knowledge in large databases, and finding association rules is one of its most widely studied topics. Many algorithms for mining association rules have been proposed, but most of them handle only binary variables. Transactions with quantitative values are, however, common in real-world applications. Fuzzy-set theory can be used to handle such values efficiently and to derive linguistic association rules. In this thesis, three algorithms for deriving complete fuzzy frequent itemsets from quantitative databases are proposed: the multiple fuzzy FP-tree (MFFP-tree) algorithm, the compressed multiple fuzzy FP-tree (CMFFP-tree) algorithm, and the upper-bound multiple fuzzy FP-tree (UBMFFP-tree) algorithm. In all three algorithms, more than one fuzzy region, rather than a single region, is used to represent an item, which makes it possible to derive complete fuzzy frequent itemsets. Experiments are conducted to compare the performance of the three proposed algorithms, and the results show that they achieve a good trade-off between execution time and tree complexity. In addition, an integrated MFFP (iMFFP) tree algorithm is proposed for merging several individual MFFP trees into an integrated one, which can help derive global association rules from distributed databases.
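
The idea of representing an item by more than one fuzzy region can be illustrated with a small sketch. This is not the thesis's MFFP-tree algorithm, and it builds no tree. It only shows, under stated assumptions, how quantities might be turned into linguistic terms and how keeping every frequent region differs from keeping only the strongest region per item. The three regions (Low, Middle, High), the breakpoints 1, 6 and 11, the function names, and the sample transactions are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def memberships(q):
    """Map a quantity to degrees in three assumed fuzzy regions.

    The breakpoints (1, 6, 11) are illustrative assumptions,
    not values taken from the thesis.
    """
    low = 1.0 if q <= 1 else max(0.0, (6 - q) / 5)
    high = 1.0 if q >= 11 else max(0.0, (q - 6) / 5)
    if q <= 1 or q >= 11:
        middle = 0.0
    elif q <= 6:
        middle = (q - 1) / 5
    else:
        middle = (11 - q) / 5
    return {"Low": low, "Middle": middle, "High": high}


def fuzzy_counts(transactions):
    """Sum membership degrees per (item, region) over all transactions."""
    counts = defaultdict(float)
    for transaction in transactions:
        for item, quantity in transaction.items():
            for region, degree in memberships(quantity).items():
                if degree > 0:
                    counts[(item, region)] += degree
    return dict(counts)


def frequent_regions(counts, min_count, keep_all=True):
    """Return the (item, region) pairs whose summed degree reaches min_count.

    keep_all=True keeps every qualifying region of an item (multiple regions).
    keep_all=False first reduces each item to its single strongest region.
    """
    if keep_all:
        return {key for key, value in counts.items() if value >= min_count}
    best = {}
    for (item, region), value in counts.items():
        if item not in best or value > counts[(item, best[item])]:
            best[item] = region
    return {(i, r) for i, r in best.items() if counts[(i, r)] >= min_count}


# Hypothetical quantitative transactions: item -> purchased quantity.
transactions = [{"A": 5}, {"A": 8}, {"A": 3, "B": 10}]
counts = fuzzy_counts(transactions)
multi = frequent_regions(counts, min_count=0.7)
single = frequent_regions(counts, min_count=0.7, keep_all=False)
```

With these sample values, the multiple-region variant retains both the Low and Middle regions of item A, while the single-region variant keeps only Middle. Linguistic itemsets involving A.Low are therefore lost in the second case, which is the kind of incompleteness the multiple-region representation is meant to avoid.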