A Methodology for Validating Diversity in Synthetic Time Series Generation

Robust evaluation of time series models often requires high volumes of data to ensure an appropriate level of rigor in testing. However, for many researchers, the lack of time series data presents a barrier to deeper evaluation. While researchers have developed...

Bibliographic Details
Main Authors: Fouad Bahrpeyma, Mark Roantree, Paolo Cappellari, Michael Scriney, Andrew McCarren
Format: Article
Language: English
Published: Elsevier 2021-01-01
Series: MethodsX
Subjects: Synthetic time series, Time series features, Diversity, Coverage, Forecasting
Online Access: http://www.sciencedirect.com/science/article/pii/S2215016121002521
id doaj-677b2700240a4604bb195288aade4b10
record_format Article
spelling doaj-677b2700240a4604bb195288aade4b10
Record timestamp: 2021-08-06T04:21:55Z
Language: eng
Publisher: Elsevier
Journal: MethodsX (ISSN 2215-0161)
Publication date: 2021-01-01
Volume: 8
Article number: 101459
Title: A Methodology for Validating Diversity in Synthetic Time Series Generation
Author affiliations:
Fouad Bahrpeyma (corresponding author): Insight Centre for Data Analytics, School of Computing, Dublin City University, Dublin 9, Ireland
Mark Roantree: VistaMilk SFI Research Centre, Dublin City University, Dublin 9, Ireland
Paolo Cappellari: City University of New York, 2800 Victory Blvd, Staten Island, 10314 NY, USA
Michael Scriney: Insight Centre for Data Analytics, School of Computing, Dublin City University, Dublin 9, Ireland
Andrew McCarren: Insight Centre for Data Analytics, School of Computing, Dublin City University, Dublin 9, Ireland
collection DOAJ
language English
format Article
sources DOAJ
author Fouad Bahrpeyma
Mark Roantree
Paolo Cappellari
Michael Scriney
Andrew McCarren
title A Methodology for Validating Diversity in Synthetic Time Series Generation
publisher Elsevier
series MethodsX
issn 2215-0161
publishDate 2021-01-01
description Robust evaluation of time series models often requires high volumes of data to ensure an appropriate level of rigor in testing. However, for many researchers, the lack of time series data presents a barrier to deeper evaluation. While researchers have developed and used synthetic datasets, developing such data requires a methodological approach to testing the entire dataset against a set of metrics that capture its diversity. Unless researchers are confident that their test datasets display a broad set of time series characteristics, an evaluation may favor one type of predictive model over another, undermining the evaluation of new predictive methods. In this paper, we present a new approach to generating and evaluating a large number of time series. The construction algorithm and validation framework are described in detail, together with an analysis of the level of diversity present in the synthetic dataset.
topic Synthetic time series
Time series features
Diversity
Coverage
Forecasting
url http://www.sciencedirect.com/science/article/pii/S2215016121002521