Summary: | 碩士 === 淡江大學 === 資訊管理學系碩士班 === 96 === Sequential pattern mining technique is developed to determine time-related behavior in sequence databases. Most of the previous proposed methods discover frequent subsequences as patterns but do not consider the confidence issue. Besides, although the discovered sequential patterns can reveal the order of events, but the time between events is not well determined.
This dissertation presents, E-PrefixSpan, a new method for mining frequent and more confident association rules from sequential databases. The method is based on the PrefixSpan[20] algorithm. To take the advantage of the pattern-growth[21] mining approach and discover the time related sequential patterns, E-PrefixSpan records the time-intervals between items and creates projected databases to reduce the times of database scanning. Sequential pattern mining often generates a huge number of rules. To reduce the number of the correlated pattern without information loss, E-PrefixSpan applys the confidence pattern mining technique .
The proposed approach is compared to existing sequential pattern mining methods to show how they complement each other to discover association rules. Our performance study shows that E-PrefixSpan is a valuable approach to condense the correlated patterns and provide additional time-interval information for sequential pattern.
|