Visualization of big data security: a case study on the KDD99 cup data set

Cyber security has been thrust into the limelight in the modern technological era because of an array of attacks often bypassing untrained intrusion detection systems (IDSs). Therefore, greater attention has been directed toward devising better methods for identifying attack types to train...


Bibliographic Details
Main Authors: Zichan Ruan, Yuantian Miao, Lei Pan, Nicholas Patterson, Jun Zhang
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2017-11-01
Series: Digital Communications and Networks
Subjects:
MDS
PCA
Online Access: http://www.sciencedirect.com/science/article/pii/S2352864817300810
id doaj-3b9f02c3628a40949125bfa98ee85a74
record_format Article
spelling doaj-3b9f02c3628a40949125bfa98ee85a74; 2021-02-02T08:17:36Z; eng; KeAi Communications Co., Ltd.; Digital Communications and Networks; 2352-8648; 2017-11-01; 3; 4; 250; 259; 10.1016/j.dcan.2017.07.004
collection DOAJ
language English
format Article
sources DOAJ
author Zichan Ruan
Yuantian Miao
Lei Pan
Nicholas Patterson
Jun Zhang
title Visualization of big data security: a case study on the KDD99 cup data set
publisher KeAi Communications Co., Ltd.
series Digital Communications and Networks
issn 2352-8648
publishDate 2017-11-01
description Cyber security has been thrust into the limelight in the modern technological era because of an array of attacks often bypassing untrained intrusion detection systems (IDSs). Therefore, greater attention has been directed toward devising better methods for identifying attack types to train IDSs more effectively. Key cyber-attack insights exist in big data; however, an efficient approach is required to determine strong attack types to train IDSs to become more effective in key areas. Despite the growth in IDS research, there is a lack of studies involving big data visualization, which is key. The KDD99 data set has served as a strong benchmark since 1999; therefore, we utilized this data set in our experiment. In this study, we utilized a hash algorithm, a weight table, and a sampling method to deal with the inherent problems caused by analyzing big data: volume, variety, and velocity. By utilizing a visualization algorithm, we were able to gain insights into the KDD99 data set with a clear identification of “normal” clusters and described distinct clusters of effective attacks.
topic Big data visualization
Sampling method
MDS
PCA
url http://www.sciencedirect.com/science/article/pii/S2352864817300810