Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets

As a powerful nonlinear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in a batch mode, leading to some potential problems when handling massive or online datasets. To overcome this drawb...


Bibliographic Details
Main Authors: Feng Zhao, Islem Rekik, Seong-Whan Lee, Jing Liu, Junying Zhang, Dinggang Shen
Format: Article
Language: English
Published: Hindawi-Wiley 2019-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2019/5937274
id doaj-e27b58454a0f4cc5abe0e6d3904d37dd
record_format Article
spelling doaj-e27b58454a0f4cc5abe0e6d3904d37dd; 2020-11-25T00:56:39Z; eng; Hindawi-Wiley; Complexity; 1076-2787; 1099-0526; 2019-01-01; 2019; 10.1155/2019/5937274; 5937274; Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets
Feng Zhao (0), Islem Rekik (1), Seong-Whan Lee (2), Jing Liu (3), Junying Zhang (4), Dinggang Shen (5)
0: School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
1: BASIRA Lab, Faculty of Computer and Informatics, Istanbul Technical University, Istanbul, Turkey
2: Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
3: School of Electronic Engineering, Xian University of Posts and Telecommunications, Xi’an, China
4: School of Computer Science and Engineering, Xidian University, Xi’an, China
5: Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
http://dx.doi.org/10.1155/2019/5937274
collection DOAJ
language English
format Article
sources DOAJ
author Feng Zhao
Islem Rekik
Seong-Whan Lee
Jing Liu
Junying Zhang
Dinggang Shen
title Two-Phase Incremental Kernel PCA for Learning Massive or Online Datasets
publisher Hindawi-Wiley
series Complexity
issn 1076-2787
1099-0526
publishDate 2019-01-01
description As a powerful nonlinear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in a batch mode, leading to some potential problems when handling massive or online datasets. To overcome this drawback of KPCA, in this paper, we propose a two-phase incremental KPCA (TP-IKPCA) algorithm which can incorporate data into KPCA in an incremental fashion. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend an incremental principal component analysis (IPCA) to estimate the kernel principal components. Extensive experimental results on both synthesized and real datasets showed that the proposed TP-IKPCA produces principal components similar to those of conventional batch-based KPCA but is computationally faster than KPCA and several of its incremental variants. Therefore, our algorithm can be applied to massive or online datasets where the batch method is not available.
url http://dx.doi.org/10.1155/2019/5937274
_version_ 1725226198770909184
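
The two-phase structure named in the description (first express the data explicitly in the kernel space, then run an incremental PCA on that representation) can be illustrated with a minimal sketch. This is not the authors' TP-IKPCA: phase 1 here uses a Nyström-style landmark map as a stand-in for the paper's explicit kernel-space representation, and phase 2 uses a running mean and scatter-matrix update as a stand-in for the paper's IPCA extension. The RBF kernel, the value of `gamma`, the landmark choice, and the names `ExplicitKernelMap` and `RunningPCA` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class ExplicitKernelMap:
    """Phase 1 stand-in (assumption): Nystroem-style explicit feature map.

    Features are built from a fixed landmark set Z so that
    transform(Z) @ transform(Z).T approximately reproduces k(Z, Z).
    """

    def __init__(self, landmarks, gamma, eps=1e-8):
        self.Z = landmarks
        self.gamma = gamma
        w, U = np.linalg.eigh(rbf_kernel(landmarks, landmarks, gamma))
        keep = w > eps * w.max()  # drop numerically null directions
        self.proj = U[:, keep] / np.sqrt(w[keep])

    def transform(self, X):
        return rbf_kernel(X, self.Z, self.gamma) @ self.proj


class RunningPCA:
    """Phase 2 stand-in (assumption): PCA from a running mean and scatter matrix."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))

    def partial_fit(self, F):
        m = F.shape[0]
        batch_mean = F.mean(axis=0)
        centered = F - batch_mean
        delta = batch_mean - self.mean
        total = self.n + m
        # Merge the chunk's scatter with the running scatter (pairwise update).
        self.scatter += centered.T @ centered
        self.scatter += np.outer(delta, delta) * (self.n * m / total)
        self.mean += delta * (m / total)
        self.n = total

    def components(self, k):
        w, V = np.linalg.eigh(self.scatter / (self.n - 1))
        order = np.argsort(w)[::-1][:k]  # largest eigenvalues first
        return w[order], V[:, order]


# Usage: stream synthetic data in chunks; only one chunk is mapped at a time.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
fmap = ExplicitKernelMap(landmarks=X[:20], gamma=0.5)
pca = RunningPCA(dim=fmap.proj.shape[1])
for start in range(0, X.shape[0], 50):
    pca.partial_fit(fmap.transform(X[start:start + 50]))
eigvals, comps = pca.components(k=2)
```

Because the scatter update is exact, the chunked result matches a batch PCA on the same mapped features; how closely it tracks true batch KPCA depends on the landmark approximation, which is a property of this sketch and not a claim about the paper's method.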