Summary: | Master's === National Cheng Kung University === Department of Computer Science and Information Engineering, Master's and Doctoral Program === 93 === As applications in domains such as multimedia, bioinformatics, finance and science grow rapidly, efficient methods for retrieving useful knowledge from time-series data become extremely important. Such data are usually high-dimensional and very large, so many researchers use approximation-based methods that reduce the dimensionality of the data to improve performance. The main idea of these popular solutions is to transform the original data into a set of representatives and to use those representatives in later analysis. Although most approximation-based methods increase the error rate, the techniques we propose retain good quality when searching for similar subsequences. In this paper, we focus on an efficient method for similar-subsequence searching that balances performance against accuracy and can find patterns that depend on domain knowledge, such as negative effects in time-series microarray data. We propose a solution that uses a symbolic method to search for similar subsequences and integrates the advantages of other methods. Experiments on biological data show that the proposed method scales much better than Agrawal’s method and the time-lagged method.
|