Early classification of multivariate temporal observations by extraction of interpretable shapelets

Abstract. Background: Early classification of time series is beneficial for biomedical informatics problems including, but not limited to, disease change detection. Early classification can be of tremendous help by identifying the onset of a disease...


Bibliographic Details
Main Authors: Ghalwash Mohamed F, Obradovic Zoran
Format: Article
Language: English
Published: BMC 2012-08-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/13/195
id doaj-f153c7bb3cfa42aab90b7645d593599f
record_format Article
spelling doaj-f153c7bb3cfa42aab90b7645d593599f 2020-11-25T00:19:21Z eng BMC BMC Bioinformatics 1471-2105 2012-08-01 13 1 195 10.1186/1471-2105-13-195 Early classification of multivariate temporal observations by extraction of interpretable shapelets Ghalwash Mohamed F Obradovic Zoran http://www.biomedcentral.com/1471-2105/13/195
collection DOAJ
language English
format Article
sources DOAJ
author Ghalwash Mohamed F
Obradovic Zoran
spellingShingle Ghalwash Mohamed F
Obradovic Zoran
Early classification of multivariate temporal observations by extraction of interpretable shapelets
BMC Bioinformatics
author_facet Ghalwash Mohamed F
Obradovic Zoran
author_sort Ghalwash Mohamed F
title Early classification of multivariate temporal observations by extraction of interpretable shapelets
title_short Early classification of multivariate temporal observations by extraction of interpretable shapelets
title_full Early classification of multivariate temporal observations by extraction of interpretable shapelets
title_fullStr Early classification of multivariate temporal observations by extraction of interpretable shapelets
title_full_unstemmed Early classification of multivariate temporal observations by extraction of interpretable shapelets
title_sort early classification of multivariate temporal observations by extraction of interpretable shapelets
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2012-08-01
description Abstract. Background: Early classification of time series is beneficial for biomedical informatics problems including, but not limited to, disease change detection. Early classification can be of tremendous help by identifying the onset of a disease before it has time to fully take hold. In addition, extracting patterns from the original time series helps domain experts to gain insights into the classification results. This problem has been studied recently using time series segments called shapelets. In this paper, we present a method, which we call Multivariate Shapelets Detection (MSD), that allows for early and patient-specific classification of multivariate time series. The method extracts time series patterns, called multivariate shapelets, from all dimensions of the time series that distinctly manifest the target class locally. Time series are classified by searching for the earliest closest patterns. Results: The proposed early classification method for multivariate time series has been evaluated on eight gene expression datasets from viral infection and drug response studies in humans. In our experiments, the MSD method outperformed the baseline methods, achieving highly accurate classification by using as little as 40%-64% of the time series. The obtained results provide evidence that using conventional classification methods on short time series is not as accurate as using the proposed methods specialized for early classification. Conclusion: For the early classification task, we proposed a method called Multivariate Shapelets Detection (MSD), which extracts patterns from all dimensions of the time series. We showed that the MSD method can classify the time series early by using as little as 40%-64% of the time series’ length.
url http://www.biomedcentral.com/1471-2105/13/195
work_keys_str_mv AT ghalwashmohamedf earlyclassificationofmultivariatetemporalobservationsbyextractionofinterpretableshapelets
AT obradoviczoran earlyclassificationofmultivariatetemporalobservationsbyextractionofinterpretableshapelets
_version_ 1725371875675078656
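
Illustrative sketch (not part of the catalog record, and not the authors' implementation): the abstract says that time series are classified "by searching for the earliest closest patterns". The Python below shows one way that idea could look, under loudly stated assumptions. It assumes a multivariate shapelet is stored as one short segment per dimension, a per-dimension distance threshold, and a class label; the dictionary keys, the function names, the Euclidean distance, and the label "infected" are all hypothetical choices made for this example. How shapelets are extracted from training data is not covered.

```python
import math


def segment_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def window_distances(shapelet, series, end):
    """Per-dimension distances between the shapelet and the window of
    `series` covering indices [end - length, end). Returns None when the
    window would start before the beginning of the series."""
    length = len(shapelet["segments"][0])
    start = end - length
    if start < 0:
        return None
    return [
        segment_distance(segment, dimension[start:end])
        for segment, dimension in zip(shapelet["segments"], series)
    ]


def classify_early(series, shapelets, default=None):
    """Scan the series one time point at a time and stop at the first
    prefix where some shapelet is within its threshold in every dimension.
    If several shapelets match at that point, the one with the smallest
    summed distance wins. Returns (label, number_of_points_used)."""
    n = len(series[0])
    for t in range(1, n + 1):
        candidates = []
        for shapelet in shapelets:
            dists = window_distances(shapelet, series, t)
            if dists is None:
                continue
            if all(d <= thr for d, thr in zip(dists, shapelet["thresholds"])):
                candidates.append((sum(dists), shapelet["label"]))
        if candidates:
            return min(candidates, key=lambda c: c[0])[1], t
    return default, n


# Hypothetical toy data: two dimensions, six time points.
example_series = [
    [0.0, 0.0, 1.0, 2.0, 5.0, 5.0],
    [0.0, 0.0, 0.0, 1.0, 3.0, 3.0],
]
example_shapelets = [
    {
        "segments": [[1.0, 2.0], [0.0, 1.0]],  # one segment per dimension
        "thresholds": [0.5, 0.5],              # assumed per-dimension limits
        "label": "infected",                   # hypothetical class label
    }
]

label, points_used = classify_early(example_series, example_shapelets)
```

In this toy case the shapelet first fits the window ending at the fourth time point, so a decision is reached after reading four of the six points. The share of the series read before a decision is the kind of quantity the abstract reports as 40%-64%.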