Summary: | Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 97 === Previous studies on mining repeating patterns focus on discovering substrings that appear frequently in a long string converted from music. An example of such a repeating pattern is "if the stock prices of companies A and B both go up on day one, the stock price of company C will go up exactly on day five." However, the problem formulation proposed by Tung places too many restrictions on mining repeating patterns from a set sequence: potentially frequent patterns cannot be found because their frequencies are distributed. Hence, in this work we define a new pattern that allows a gap between two adjacent sets, and we propose an algorithm, G-Apriori, to discover repeating patterns with a gap constraint from a set sequence. G-Apriori generates candidates and counts their frequencies by scanning the database. To avoid scanning the database repeatedly, we propose a second algorithm, GwI-Apriori. GwI-Apriori maintains an index list, consisting of a start position (SP) list and an end position (EP) list, that records the positions of the frequent patterns. GwI-Apriori also applies an additional strategy for pruning the search space among the index lists. Using the index lists, GwI-Apriori scans the database only once and computes the frequencies of frequent patterns from the index lists. The experimental results show that GwI-Apriori performs much better than G-Apriori.
|
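The index-list idea in the summary can be illustrated with a minimal Python sketch. This is not the thesis's GwI-Apriori implementation: the summary does not give the exact pattern definition, the frequency measure, or the pruning strategy, so everything below is an assumption made for illustration. In particular, the sketch assumes that a pattern is a list of itemsets, that the gap is the number of sets skipped between two adjacent matched sets, and that an index list can be represented as a set of (SP, EP) pairs. The function names `itemset_positions`, `initial_index`, and `extend_index` are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

# Assumed representation: one (start position, end position) pair per occurrence.
Index = Set[Tuple[int, int]]


def itemset_positions(seq: Sequence[Set[str]], itemset: FrozenSet[str]) -> List[int]:
    """Scan the set sequence once and return every position containing the itemset."""
    return [i for i, s in enumerate(seq) if itemset <= s]


def initial_index(positions: List[int]) -> Index:
    """A length-1 pattern starts and ends at the same position."""
    return {(p, p) for p in positions}


def extend_index(index: Index, positions: List[int], max_gap: int) -> Index:
    """Append one itemset to a pattern using only index data, not the sequence.

    A new occurrence (sp, q) is kept when the appended itemset occurs at q,
    after the current end position ep, with at most max_gap sets skipped.
    """
    extended: Index = set()
    for sp, ep in index:
        for q in positions:
            if 0 <= q - ep - 1 <= max_gap:
                extended.add((sp, q))
    return extended


# Hypothetical toy set sequence; the items are arbitrary labels.
seq = [{"a", "b"}, {"c"}, {"a"}, {"c"}, {"a", "b"}, {"d"}, {"c"}]

ab_positions = itemset_positions(seq, frozenset({"a", "b"}))
c_positions = itemset_positions(seq, frozenset({"c"}))

# Occurrences of the pattern <{a, b}, {c}> with at most one skipped set.
occurrences = extend_index(initial_index(ab_positions), c_positions, max_gap=1)
frequency = len(occurrences)  # assumed frequency measure: distinct (SP, EP) pairs
```

Under these assumptions, the sequence is read only while building the per-itemset position lists; longer patterns are then counted by joining index lists, which mirrors the single-scan behaviour the summary attributes to GwI-Apriori.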