Summary: | Master's === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 96 === A time series is a sequence of data points, each of which represents a value at a certain time. Many phenomena can be represented as time series, such as electrocardiograms in medical science, gene expressions in biology and video data in multimedia. Because they appear so frequently in different applications, time series have become an important and interesting research field. The field covers many research topics, including anomaly detection, similarity measurement, dimension reduction and segmentation, among others. In this thesis, we propose a time series segmentation approach that combines clustering techniques, the discrete wavelet transformation (DWT) and a genetic algorithm to automatically find segments and patterns in a time series while reducing two problems raised by a previous approach. The first problem is that using the DWT to adjust the length of the subsequences may distort the segments. The second is that a group containing only one segment may result in a less meaningful pattern. The proposed approach first divides the segments in a chromosome into k groups according to their slopes by using clustering techniques. To deal with the two problems, two factors are then introduced: the distortion factor is used to avoid distortion of the segments, and the density factor is used to avoid the generation of meaningless patterns. The fitness value of a chromosome is then evaluated from the distances of the segments and these two factors. Experimental results on real financial datasets in Taiwan show the effectiveness of the proposed approach.
|