Shapelet transforms for univariate and multivariate time series classification

Time Series Classification (TSC) is a growing field of machine learning research. One particular algorithm from the TSC literature is the Shapelet Transform (ST). Shapelets are phase-independent subsequences that are extracted from time series to form discriminatory features. It has been shown th...

Bibliographic Details
Main Author: Bostrom, Aaron
Published: University of East Anglia 2018
Subjects:
004
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.743360
id ndltd-bl.uk-oai-ethos.bl.uk-743360
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-743360; 2019-03-05T15:45:07Z; Shapelet transforms for univariate and multivariate time series classification; Bostrom, Aaron; 2018; 004; University of East Anglia; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.743360; https://ueaeprints.uea.ac.uk/67270/; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 004
description Time Series Classification (TSC) is a growing field of machine learning research. One particular algorithm from the TSC literature is the Shapelet Transform (ST). Shapelets are phase-independent subsequences that are extracted from time series to form discriminatory features. It has been shown that using the shapelets to transform the datasets into a new space can improve performance. One of the major problems with ST is that the algorithm is O(n^2 m^4), where n is the number of time series and m is the length of the series. As a problem increases in size, or additional dimensions are added, the algorithm quickly becomes computationally infeasible. The research question addressed is whether the shapelet transform can be improved in terms of accuracy and speed. Making algorithmic improvements to shapelets will enable the development of multivariate shapelet algorithms that can attempt to solve much larger problems in realistic time frames. In support of this thesis, a new distance early abandon method is proposed. A class balancing algorithm is implemented, which uses a one-vs-all multi-class information gain that enables heuristics which were developed for two-class problems. To support these improvements, a large-scale analysis of the best shapelet algorithms is conducted as part of a larger experimental evaluation. ST is proven to be one of the most accurate algorithms in TSC on the UCR-UEA datasets. Contract classification is proposed for shapelets, where a fixed run time is set and the number of shapelets is bounded. Four search algorithms are evaluated with fixed run times of one hour and one day, three of which are not significantly worse than a full enumeration. Finally, three multivariate shapelet algorithms are developed and compared to benchmark results and multivariate dynamic time warping.
author Bostrom, Aaron
title Shapelet transforms for univariate and multivariate time series classification
publisher University of East Anglia
publishDate 2018
url https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.743360
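
The description in this record refers to a shapelet distance, an early abandon, and a transform into a new feature space. The following is a minimal Python sketch of those three ideas, not the thesis's implementation: the function names (`z_normalise`, `shapelet_distance`, `shapelet_transform`) are hypothetical, and the early abandon shown is the generic "stop summing once the running total reaches the best distance so far" technique, assumed here for illustration rather than the new method the thesis proposes. The quoted O(n^2 m^4) cost arises because a full search has O(n m^2) candidate subsequences, and each candidate is compared against all n series at O(m^2) per comparison.

```python
import math
from typing import List, Sequence


def z_normalise(values: Sequence[float]) -> List[float]:
    """Rescale a subsequence to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std < 1e-12:
        # A flat window has no shape; map it to all zeros.
        return [0.0] * n
    return [(v - mean) / std for v in values]


def shapelet_distance(shapelet: Sequence[float], series: Sequence[float]) -> float:
    """Smallest squared Euclidean distance between the shapelet and any
    window of the same length in the series (illustrative sketch).

    Returns math.inf if the series is shorter than the shapelet.
    """
    length = len(shapelet)
    target = z_normalise(shapelet)
    best = math.inf
    for start in range(len(series) - length + 1):
        window = z_normalise(series[start:start + length])
        total = 0.0
        for a, b in zip(target, window):
            total += (a - b) ** 2
            if total >= best:
                break  # early abandon: this window cannot beat the best
        else:
            # Inner loop finished without abandoning, so total < best.
            best = total
    return best


def shapelet_transform(
    shapelets: Sequence[Sequence[float]],
    dataset: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Map each series to a vector of its distances to every shapelet."""
    return [[shapelet_distance(s, series) for s in shapelets] for series in dataset]
```

Because the distance is a minimum over all window positions, a series containing the shapelet's shape scores the same wherever that shape occurs, which is the phase independence the description mentions. The rows returned by `shapelet_transform` can then be passed to any standard classifier.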