Summary: | Master's === National University of Kaohsiung === Master's Program, Department of Computer Science and Information Engineering === 104 === Erasable-itemset mining is a recent problem well suited to factory production planning. It finds the itemsets (components) that can be eliminated because the products built from them yield profit under a given threshold. Erasable itemsets are useful when a factory needs to renew its products or downsize production while continuing to operate at a profit. A company may have several factories, each of which derives its own erasable itemsets for a given time period, and a manager of the company needs to know the overall erasable itemsets integrated from all the factories. This thesis therefore considers erasable-itemset integration, which merges the erasable itemsets from multiple sources. It uses the known erasable itemsets of each factory as reference information to reduce rescanning of the individual data sources. We start with two-factory erasable-itemset merging and propose an efficient integration approach. Within each factory an itemset is either erasable or non-erasable, so four cases arise for an itemset across two factories. Depending on the case, the merged erasable itemsets are obtained directly or by rescanning only part of the data sources, which reduces mining time. The proposed two-factory approach can also be extended to process more than two sets of erasable itemsets. Four experiments show that the proposed algorithm executes faster than the batch approach in the multiple-data-source environment for erasable itemsets.
|
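The four-case idea can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, whose details the abstract does not give. It assumes that an itemset's gain is the summed profit of every product containing at least one of its items, that an itemset is erasable when its gain is at most `ratio` times the total profit, and that each factory keeps the gains of its erasable itemsets. The names `gain`, `mine`, `merge_two` and the sample data are hypothetical.

```python
from itertools import combinations

def gain(itemset, products):
    """Summed profit of products that use at least one item of the itemset."""
    return sum(profit for items, profit in products if items & itemset)

def mine(products, universe, ratio):
    """Brute-force batch mining: {itemset: gain} for every erasable itemset.

    Assumed definition: erasable when gain <= ratio * total profit.
    """
    limit = ratio * sum(profit for _, profit in products)
    erasable = {}
    for size in range(1, len(universe) + 1):
        for combo in combinations(sorted(universe), size):
            itemset = frozenset(combo)
            g = gain(itemset, products)
            if g <= limit:
                erasable[itemset] = g
    return erasable

def merge_two(db1, er1, db2, er2, ratio):
    """Merge two factories' erasable itemsets, rescanning only when needed."""
    limit = ratio * (sum(p for _, p in db1) + sum(p for _, p in db2))
    merged, rescans = {}, 0
    for itemset in set(er1) | set(er2):
        if itemset in er1 and itemset in er2:
            # Erasable in both factories: erasable overall, gains just add up.
            merged[itemset] = er1[itemset] + er2[itemset]
            continue
        # Erasable in exactly one factory: the gain in the other factory is
        # unknown, so only that factory's data is rescanned for this itemset.
        g1 = er1[itemset] if itemset in er1 else gain(itemset, db1)
        g2 = er2[itemset] if itemset in er2 else gain(itemset, db2)
        rescans += 1
        if g1 + g2 <= limit:
            merged[itemset] = g1 + g2
    # Itemsets erasable in neither factory are never visited: both gains
    # exceed their local limits, so the sum exceeds the merged limit.
    return merged, rescans

# Hypothetical data: each product is (set of component items, profit).
ITEMS = frozenset("abcd")
DB1 = [(frozenset("ab"), 10), (frozenset("bc"), 20), (frozenset("cd"), 70)]
DB2 = [(frozenset("ad"), 60), (frozenset("b"), 10), (frozenset("c"), 30)]
RATIO = 0.5

ER1 = mine(DB1, ITEMS, RATIO)
ER2 = mine(DB2, ITEMS, RATIO)
MERGED, RESCANS = merge_two(DB1, ER1, DB2, ER2, RATIO)
BATCH = mine(DB1 + DB2, ITEMS, RATIO)
```

Both factories are mined over the same item universe here, so "not in the erasable set" really means "non-erasable in that factory". Under that assumption the merged result matches batch mining of the combined data, while touching only the itemsets that are erasable in at least one factory.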