Genetic Programming Approach for Nonstationary Data Analytics
Nonstationary data with concept drift occurring is usually made up of different underlying data generating processes. Therefore, if the knowledge of the existence of different segments in the dataset is not taken into consideration, then the induced predictive model is distorted by the past existing...
Main Author: | Kuranga, Cry |
---|---|
Other Authors: | Pillay, Nelishia |
Language: | en |
Published: | University of Pretoria 2021 |
Subjects: | Computational Intelligence; Machine learning; UCTD |
Online Access: | http://hdl.handle.net/2263/79386 Kuranga, C 2021, Genetic Programming Approach for Nonstationary Data Analytics, PhD Thesis, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/79386> |
id |
ndltd-netd.ac.za-oai-union.ndltd.org-up-oai-repository.up.ac.za-2263-79386 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-netd.ac.za-oai-union.ndltd.org-up-oai-repository.up.ac.za-2263-79386 2021-10-14T05:09:32Z Genetic Programming Approach for Nonstationary Data Analytics Kuranga, Cry Pillay, Nelishia u13024303@tuks.co.za Computational Intelligence Machine learning UCTD Nonstationary data with concept drift occurring is usually made up of different underlying data generating processes. Therefore, if the knowledge of the existence of different segments in the dataset is not taken into consideration, then the induced predictive model is distorted by the past existing patterns. Thus, the challenge posed to a regressor is to select an appropriate segment that depicts the current underlying data generating process to be used in a model induction. The proposed genetic programming approach for nonstationary data analytics (GPANDA) provides a piecewise nonlinear regression model for nonstationary data. The GPANDA consists of three components: dynamic differential evolution-based clustering algorithm to split the parameter space into subspaces that resemble different data generating processes present in the dataset; the dynamic particle swarm optimization-based model induction technique to induce nonlinear models that describe each generated cluster; and dynamic genetic programming that evolves model trees that define the boundaries of nonlinear models which are expressed as terminal nodes. If an environmental change is detected in a nonstationary dataset, a dynamic differential evolution-based clustering algorithm clusters the data. For the clusters that change, the dynamic particle swarm optimization-based model induction approach adapts nonlinear models or induces new models to create an updated genetic programming terminal set, and then the genetic programming evolves a piecewise predictive model to fit the dataset. To evaluate the effectiveness of GPANDA, experimental evaluations were conducted on both artificial and real-world datasets.
Two stock market datasets, GDP and CPI were selected to benchmark the performance of the proposed model against the leading studies. GPANDA outperformed the genetic programming algorithms designed for dynamic environments and was competitive with state-of-the-art techniques. Thesis (PhD)--University of Pretoria, 2020. UP Postgraduate Research Bursary Computer Science PhD Unrestricted 2021-04-12T08:14:21Z 2021-04-12T08:14:21Z 2021-04-20 2021-02-16 Thesis http://hdl.handle.net/2263/79386 Kuranga, C 2021, Genetic Programming Approach for Nonstationary Data Analytics, PhD Thesis, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/79386> en © 2019 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. University of Pretoria |
collection |
NDLTD |
language |
en |
sources |
NDLTD |
topic |
Computational Intelligence Machine learning UCTD |
spellingShingle |
Computational Intelligence Machine learning UCTD Kuranga, Cry Genetic Programming Approach for Nonstationary Data Analytics |
description |
Nonstationary data with concept drift occurring is usually made up of different underlying data generating processes. Therefore, if the knowledge of the existence of different segments in the dataset is not taken into consideration, then the induced predictive model is distorted by the past existing patterns. Thus, the challenge posed to a regressor is to select an appropriate segment that depicts the current underlying data generating process to be used in a model induction. The proposed genetic programming approach for nonstationary data analytics (GPANDA) provides a piecewise nonlinear regression model for nonstationary data. The GPANDA consists of three components: dynamic differential evolution-based clustering algorithm to split the parameter space into subspaces that resemble different data generating processes present in the dataset; the dynamic particle swarm optimization-based model induction technique to induce nonlinear models that describe each generated cluster;
and dynamic genetic programming that evolves model trees that define the boundaries of nonlinear models which are expressed as terminal nodes.
If an environmental change is detected in a nonstationary dataset, a dynamic differential evolution-based clustering algorithm clusters the data. For the clusters that change, the dynamic particle swarm optimization-based model induction approach adapts nonlinear models or induces new models to create an updated genetic programming terminal set, and then the genetic programming evolves a piecewise predictive model to fit the dataset.
To evaluate the effectiveness of GPANDA, experimental evaluations were conducted on both artificial and real-world datasets. Two stock market datasets, GDP and CPI were selected to benchmark the performance of the proposed model against the leading studies. GPANDA outperformed the genetic programming algorithms designed for dynamic environments and was competitive with state-of-the-art techniques. === Thesis (PhD)--University of Pretoria, 2020. === UP Postgraduate Research Bursary === Computer Science === PhD === Unrestricted |
author2 |
Pillay, Nelishia |
author_facet |
Pillay, Nelishia Kuranga, Cry |
author |
Kuranga, Cry |
author_sort |
Kuranga, Cry |
title |
Genetic Programming Approach for Nonstationary Data Analytics |
title_short |
Genetic Programming Approach for Nonstationary Data Analytics |
title_full |
Genetic Programming Approach for Nonstationary Data Analytics |
title_fullStr |
Genetic Programming Approach for Nonstationary Data Analytics |
title_full_unstemmed |
Genetic Programming Approach for Nonstationary Data Analytics |
title_sort |
genetic programming approach for nonstationary data analytics |
publisher |
University of Pretoria |
publishDate |
2021 |
url |
http://hdl.handle.net/2263/79386 Kuranga, C 2021, Genetic Programming Approach for Nonstationary Data Analytics, PhD Thesis, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/79386> |
work_keys_str_mv |
AT kurangacry geneticprogrammingapproachfornonstationarydataanalytics |
_version_ |
1719489716352450560 |
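
Illustrative note (not part of the catalogued record or the thesis): the description above outlines a piecewise regression model built from clusters, per-cluster models, and a structure that selects which model applies. The sketch below shows only that general piecewise idea under heavy simplifying assumptions. It is not GPANDA: plain 1-D k-means stands in for the differential evolution-based clustering, ordinary least-squares lines stand in for the particle swarm-induced nonlinear models, and sorted midpoints between centroids stand in for the evolved model tree. All function names here are hypothetical.

```python
from bisect import bisect_right


def cluster_1d(xs, k, iters=20):
    """Plain 1-D k-means (assumed stand-in for the DE-based clustering)."""
    s = sorted(xs)
    # Spread the initial centroids evenly across the sorted inputs.
    centroids = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centroids[j]))
            groups[nearest].append(x)
        # Keep the old centroid when a group ends up empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return sorted(centroids)


def fit_line(points):
    """Least-squares line (assumed stand-in for the PSO-induced models)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx if sxx else 0.0
    return slope, my - slope * mx


def build_piecewise(xs, ys, k):
    """Cluster the inputs, then fit one local model per segment."""
    centroids = cluster_1d(xs, k)
    # Segment boundaries: midpoints between neighbouring centroids
    # (assumed stand-in for the boundaries defined by a model tree).
    boundaries = [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]
    segments = [[] for _ in range(k)]
    for x, y in zip(xs, ys):
        segments[bisect_right(boundaries, x)].append((x, y))
    models = [fit_line(seg) if seg else (0.0, 0.0) for seg in segments]
    return boundaries, models


def predict(model, x):
    """Pick the segment containing x and apply its local model."""
    boundaries, models = model
    slope, intercept = models[bisect_right(boundaries, x)]
    return slope * x + intercept


# Synthetic data with two regimes: y = 2x on [0, 9], y = 50 - x on [20, 29].
xs = list(range(10)) + list(range(20, 30))
ys = [2 * x for x in range(10)] + [50 - x for x in range(20, 30)]
model = build_piecewise(xs, ys, k=2)
```

With this synthetic data, `predict(model, 3)` uses the first segment's line and `predict(model, 25)` the second's. In a drifting setting, one simple (assumed) policy would be to call `build_piecewise` again on a recent window whenever a change is detected; the thesis instead adapts only the clusters that changed.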