A Procedure for Identification of Appropriate State Space and ARIMA Models Based on Time-Series Cross-Validation

In this work, a cross-validation procedure is used to identify an appropriate Autoregressive Integrated Moving Average model and an appropriate state space model for a time series. A minimum size for the training set is specified. The procedure is based on one-step forecasts and uses different train...
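The evaluation scheme summarized in the abstract (a minimum training size, training sets that each grow by one observation, one-step forecasts scored by root mean squared error) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names, the fixed smoothing parameter `alpha=0.3`, and the three stand-in forecasters are assumptions introduced here. The paper fits full ARIMA and state space model families and also screens candidates with a Ljung–Box test, both of which this sketch omits.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# A forecaster maps a training window to a one-step-ahead point forecast.
Forecaster = Callable[[Sequence[float]], float]


def naive_forecast(train: Sequence[float]) -> float:
    """Random-walk forecast (the ARIMA(0,1,0) point forecast): repeat the last value."""
    return train[-1]


def mean_forecast(train: Sequence[float]) -> float:
    """Forecast the mean of the training window."""
    return sum(train) / len(train)


def ses_forecast(train: Sequence[float], alpha: float = 0.3) -> float:
    """Simple exponential smoothing with a fixed, hypothetical alpha.

    A real procedure would estimate alpha on each training window.
    """
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def rolling_origin_rmse(series: Sequence[float],
                        forecaster: Forecaster,
                        min_train: int) -> float:
    """RMSE of one-step forecasts over expanding training sets.

    The first training set holds `min_train` observations; each later one
    holds one more observation than the previous one.
    """
    if not 1 <= min_train < len(series):
        raise ValueError("min_train must be at least 1 and less than len(series)")
    squared_errors = []
    for t in range(min_train, len(series)):
        prediction = forecaster(series[:t])      # fit on observations 0..t-1
        squared_errors.append((series[t] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def select_model(series: Sequence[float],
                 candidates: Dict[str, Forecaster],
                 min_train: int) -> Tuple[str, Dict[str, float]]:
    """Return the candidate with the lowest cross-validated RMSE and all scores."""
    scores = {name: rolling_origin_rmse(series, f, min_train)
              for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # made-up toy series
    candidates = {"naive": naive_forecast, "mean": mean_forecast, "ses": ses_forecast}
    best, scores = select_model(data, candidates, min_train=3)
    print(best, scores)
```

Each forecast uses only observations that precede the point being predicted, which is what distinguishes this scheme from ordinary k-fold cross-validation.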

Bibliographic Details
Main Authors: Patrícia Ramos, José Manuel Oliveira
Format: Article
Language: English
Published: MDPI AG 2016-11-01
Series: Algorithms
Subjects: model identification, state space models, ARIMA models, forecasting, retailing
Online Access: http://www.mdpi.com/1999-4893/9/4/76
id doaj-5dff73dfc3754d7f8bb1f3f15d467c46
record_format Article
spelling doaj-5dff73dfc3754d7f8bb1f3f15d467c46; 2020-11-24T22:31:24Z; eng; MDPI AG; Algorithms; ISSN 1999-4893; 2016-11-01; 9(4), 76; 10.3390/a9040076; a9040076; Patrícia Ramos and José Manuel Oliveira, both of INESC Technology and Science, Manufacturing Systems Engineering Unit, 4200-465 Porto, Portugal
collection DOAJ
language English
format Article
sources DOAJ
author Patrícia Ramos
José Manuel Oliveira
title A Procedure for Identification of Appropriate State Space and ARIMA Models Based on Time-Series Cross-Validation
publisher MDPI AG
series Algorithms
issn 1999-4893
publishDate 2016-11-01
description In this work, a cross-validation procedure is used to identify an appropriate Autoregressive Integrated Moving Average model and an appropriate state space model for a time series. A minimum size for the training set is specified. The procedure is based on one-step forecasts and uses different training sets, each containing one more observation than the previous one. All possible state space models and all ARIMA models where the orders are allowed to range reasonably are fitted considering raw data and log-transformed data with regular differencing (up to second order differences) and, if the time series is seasonal, seasonal differencing (up to first order differences). The value of root mean squared error for each model is calculated averaging the one-step forecasts obtained. The model which has the lowest root mean squared error value and passes the Ljung–Box test using all of the available data with a reasonable significance level is selected among all the ARIMA and state space models considered. The procedure is exemplified in this paper with a case study of retail sales of different categories of women’s footwear from a Portuguese retailer, and its accuracy is compared with three reliable forecasting approaches. The results show that our procedure consistently forecasts more accurately than the other approaches and the improvements in the accuracy are significant.
topic model identification
state space models
ARIMA models
forecasting
retailing
url http://www.mdpi.com/1999-4893/9/4/76