Effect Improved for High-Dimensional and Unbalanced Data Anomaly Detection Model Based on KNN-SMOTE-LSTM

Anomaly detection on high-dimensional, unbalanced data is a common task. Effective anomaly detection is essential for early warning of problems or disasters and for maintaining system reliability. Detecting anomalies is a significant research issue in sensor data analysis. The anomaly dete...

Bibliographic Details
Main Authors: Fuguang Bao, Yongqiang Wu, Zhaogang Li, Yongzhao Li, Lili Liu, Guanyu Chen
Format: Article
Language: English
Published: Hindawi-Wiley 2020-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2020/9084704
id doaj-692df8c6c7cd4442a07b754c0b8ed357
record_format Article
spelling doaj-692df8c6c7cd4442a07b754c0b8ed357; 2020-11-25T01:23:06Z; eng; Hindawi-Wiley; Complexity; 1076-2787; 1099-0526; 2020-01-01; 2020; 10.1155/2020/9084704; 9084704
Effect Improved for High-Dimensional and Unbalanced Data Anomaly Detection Model Based on KNN-SMOTE-LSTM
Fuguang Bao (0): Contemporary Business and Trade Research Center, Zhejiang Gongshang University, Hangzhou 310018, China
Yongqiang Wu (1): Zhejiang Wellsun Intelligent Technology Co., Ltd., Hangzhou 310018, China
Zhaogang Li (2): Zhejiang Wellsun Intelligent Technology Co., Ltd., Hangzhou 310018, China
Yongzhao Li (3): Zhejiang Wellsun Intelligent Technology Co., Ltd., Hangzhou 310018, China
Lili Liu (4): Zhejiang Wellsun Intelligent Technology Co., Ltd., Hangzhou 310018, China
Guanyu Chen (5): School of Management Science & Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
http://dx.doi.org/10.1155/2020/9084704
collection DOAJ
language English
format Article
sources DOAJ
author Fuguang Bao
Yongqiang Wu
Zhaogang Li
Yongzhao Li
Lili Liu
Guanyu Chen
title Effect Improved for High-Dimensional and Unbalanced Data Anomaly Detection Model Based on KNN-SMOTE-LSTM
publisher Hindawi-Wiley
series Complexity
issn 1076-2787
1099-0526
publishDate 2020-01-01
description Anomaly detection on high-dimensional, unbalanced data is a common task. Effective anomaly detection is essential for early warning of problems or disasters and for maintaining system reliability. Detecting anomalies is a significant research issue in sensor data analysis. Anomaly detection is essentially a binary classification problem on unbalanced sequences. Data of this type is characterized by large scale, high computational complexity, unbalanced class distribution, and sequential dependence among samples. This paper uses long short-term memory networks (LSTMs) combined with historical sequence data, integrates the synthetic minority oversampling technique (SMOTE) and K-nearest neighbors (kNN), and constructs an anomaly detection network model based on kNN-SMOTE-LSTM suited to the unbalanced nature of the data. Through a kNN discriminant classifier, the model continuously filters the generated samples and keeps only safe ones, which improves model performance and avoids the blindness and limitations of the SMOTE algorithm in generating new samples. The experiments demonstrated that the structured kNN-SMOTE-LSTM model can significantly improve performance on unbalanced sequence binary classification.
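The description names the idea of filtering SMOTE-generated samples with a kNN classifier but gives no procedure. The sketch below is one plausible reading, not the authors' published method: SMOTE interpolates between a minority sample and one of its minority neighbours, and a candidate is kept only if a majority-vote kNN over the original data also labels it as minority. The function names, the parameters (`k_smote`, `k_filter`, `max_tries`), and the toy data are all hypothetical, and the LSTM stage is omitted.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point."""
    return sorted(candidates, key=lambda c: euclidean(point, c))[:k]


def knn_predict(point, samples, labels, k):
    """Majority-vote kNN: 1 (minority/anomaly) or 0 (majority/normal)."""
    order = sorted(range(len(samples)),
                   key=lambda i: euclidean(point, samples[i]))[:k]
    votes = sum(labels[i] for i in order)
    return 1 if votes * 2 > len(order) else 0


def knn_filtered_smote(majority, minority, n_new,
                       k_smote=3, k_filter=3, seed=0, max_tries=1000):
    """Generate up to n_new synthetic minority samples (assumed procedure).

    Each candidate is a random point on the segment between a minority
    sample and one of its k_smote nearest minority neighbours (plain SMOTE).
    It is accepted only if kNN over the original data labels it minority.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    samples = majority + minority
    labels = [0] * len(majority) + [1] * len(minority)
    accepted = []
    tries = 0
    while len(accepted) < n_new and tries < max_tries:
        tries += 1
        base = rng.choice(minority)
        # Ask for one extra neighbour because base itself is in the pool.
        neighbours = [m for m in k_nearest(base, minority, k_smote + 1)
                      if m is not base]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        candidate = [b + gap * (n - b) for b, n in zip(base, neighbour)]
        # kNN filter: discard candidates that land in majority territory.
        if knn_predict(candidate, samples, labels, k_filter) == 1:
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    # Toy data (hypothetical): a normal cluster and a small anomaly cluster.
    normal = [[0.0, 0.0], [0.5, 0.2], [0.1, 0.7], [0.9, 0.4]]
    anomalies = [[5.0, 5.0], [5.5, 5.2], [4.8, 5.6]]
    synthetic = knn_filtered_smote(normal, anomalies, n_new=5, k_smote=2)
    print(len(synthetic), "synthetic minority samples accepted")
```

In this reading the filter guards against the "blindness" the description mentions: plain SMOTE interpolates without checking whether the new point falls among majority samples, whereas here such points are rejected before they reach the training set.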
url http://dx.doi.org/10.1155/2020/9084704