Summary: | Attribute-Oriented Induction of High-level Emerging Patterns (AOI-HEP) is a combination of Attribute-Oriented Induction (AOI) and Emerging Patterns (EP). AOI is a summarisation algorithm that compacts a given dataset into small conceptual descriptions, where each attribute has a defined concept hierarchy. The resulting patterns are easily readable and understandable. Emerging patterns are patterns discovered between two datasets and between two time periods, such that patterns found in the first dataset have grown (or shrunk) in size, disappeared entirely, or new ones have emerged. Unlike EP mining algorithms, AOI-HEP is not influenced by a border-based algorithm. It is therefore desirable to obtain summarised emerging patterns between two datasets, and we propose the High-level Emerging Pattern (HEP) algorithm for this purpose. The main purpose of combining AOI and EP is to use the characteristic strengths of both to extract important high-level emerging patterns from data. The AOI characteristic rule algorithm is run twice, once on each of two input datasets, to create two rulesets, which are then processed with the HEP algorithm. First, the HEP algorithm takes the Cartesian product of the two rulesets and eliminates rules by computing a similarity metric (a categorisation of attribute comparisons). Second, the rules that pass the similarity metric are discriminated by computing a growth rate, the ratio of the supports of the rules from the two rulesets. The categorisation of attribute comparisons is based on the similarity hierarchy level. Attributes were found to subsume each other in three ways, giving Total Subsumption HEP (TSHEP), Subsumption Overlapping HEP (SOHEP) and Total Overlapping HEP (TOHEP) patterns. Meanwhile, from certain similarity hierarchy levels and values, we can mine frequent and similar patterns that create discriminant rules.
We used four large real datasets from the UCI Machine Learning Repository and discovered valuable HEP patterns, including strong discriminant rules and frequent and similar patterns. Moreover, the experiments showed that most datasets contain SOHEP but not TSHEP or TOHEP, and that TOHEP were the most rarely found. Since AOI-HEP can strongly discriminate high-level data, it can be applied to discriminate datasets, for example to distinguish bad from good customers in banking loan systems or among credit card applicants. AOI-HEP can also be applied to mine similar patterns, for instance similar customer loan patterns.
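As a minimal illustrative sketch (not the implementation used in the experiments), the two HEP steps described above could look as follows in Python. The toy concept hierarchy, the rule layout, all function names, and the exact rule for assigning TSHEP, SOHEP and TOHEP are assumptions made for illustration only. The growth rate is assumed to be the support in the second ruleset divided by the support in the first, with the usual emerging-pattern convention of 0 when both supports are 0 and infinity when only the first is 0.

```python
import math
from itertools import product

# Hypothetical concept hierarchy: child value -> parent value.
PARENT = {"France": "Europe", "Europe": "ANY", "Asia": "ANY", "graduate": "ANY"}

def ancestors(value):
    """Return every value above `value` in the toy hierarchy."""
    found = set()
    while value in PARENT:
        value = PARENT[value]
        found.add(value)
    return found

def compare(a, b):
    """Classify one attribute comparison (assumed definitions)."""
    if a == b:
        return "overlapping"
    if a in ancestors(b) or b in ancestors(a):
        return "subsumption"
    return None  # unrelated values: the rule pair is eliminated

def categorise(comparisons):
    """Map the set of attribute comparisons to a HEP category (assumed)."""
    kinds = set(comparisons)
    if None in kinds:
        return None
    if kinds == {"subsumption"}:
        return "TSHEP"
    if kinds == {"overlapping"}:
        return "TOHEP"
    return "SOHEP"

def growth_rate(support_1, support_2):
    """Ratio of supports between the two rulesets."""
    if support_1 == 0 and support_2 == 0:
        return 0.0
    if support_1 == 0:
        return math.inf
    return support_2 / support_1

def mine_hep(ruleset_1, ruleset_2):
    """Cartesian product, similarity filtering, then growth rate."""
    patterns = []
    for (values_1, supp_1), (values_2, supp_2) in product(ruleset_1, ruleset_2):
        category = categorise(compare(a, b) for a, b in zip(values_1, values_2))
        if category is not None:
            patterns.append((category, growth_rate(supp_1, supp_2), values_1, values_2))
    return patterns

# Toy rulesets: (attribute values, relative support).
ruleset_1 = [(("Europe", "graduate"), 0.25), (("Asia", "ANY"), 0.5)]
ruleset_2 = [(("France", "graduate"), 0.75), (("Asia", "ANY"), 0.125)]

for pattern in mine_hep(ruleset_1, ruleset_2):
    print(pattern)
```

In this toy run, two of the four rule pairs are eliminated because their first attributes are unrelated in the hierarchy; the remaining pairs are labelled and ranked by growth rate.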
|