Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow

The fine-grained classification of encrypted traffic is important for network security analysis. Malicious attacks are usually encrypted and simulated as normal application or content traffic. Supervised machine learning methods are widely used for traffic classification and show good performances....

Bibliographic Details
Main Authors: Chencheng Ma, Xuehui Du, Lifeng Cao
Format: Article
Language: English
Published: MDPI AG 2020-02-01
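Series: Electronics
Subjects: encrypted network flow classification; k-nearest neighbor algorithm; feature selection and weighted; fine-grained analysis; small training set
Online Access: https://www.mdpi.com/2079-9292/9/2/324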
id doaj-5b8a729090304396af7c2afd04dce56f
record_format Article
spelling doaj-5b8a729090304396af7c2afd04dce56f
2020-11-25T02:26:34Z
eng
MDPI AG
Electronics
2079-9292
2020-02-01
9
2
324
10.3390/electronics9020324
electronics9020324
Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
Chencheng Ma (0)
Xuehui Du (1)
Lifeng Cao (2)
National Digital Switching System Engineering and Technological Research Center, Zhengzhou 450000, China
National Digital Switching System Engineering and Technological Research Center, Zhengzhou 450000, China
National Digital Switching System Engineering and Technological Research Center, Zhengzhou 450000, China
The fine-grained classification of encrypted traffic is important for network security analysis. Malicious attacks are usually encrypted and simulated as normal application or content traffic. Supervised machine learning methods are widely used for traffic classification and show good performances. However, they need a large amount of labeled data to train a model, while labeled data is hard to obtain. Aiming at solving this problem, this paper proposes a method to train a model based on the K-nearest neighbor (KNN) algorithm, which only needs a small amount of data. Due to the fact that the importance of different traffic features varies, and traditional KNN does not highlight the importance of different features, this study introduces the concept of feature weight and proposes the weighted feature KNN (WKNN) algorithm. Furthermore, to obtain the optimal feature set and the corresponding feature weight set, a feature selection and feature weight self-adaptive algorithm for WKNN is proposed. In addition, a three-layer classification framework for encrypted network flows is established. Based on the improved KNN and the framework, this study finally presents a method for fine-grained classification of encrypted network flows, which can identify the encryption status, application type and content type of encrypted network flows with high accuracies of 99.3%, 92.4%, and 97.0%, respectively.
https://www.mdpi.com/2079-9292/9/2/324
encrypted network flow classification
k-nearest neighbor algorithm
feature selection and weighted
fine-grained analysis
small training set
collection DOAJ
language English
format Article
sources DOAJ
author Chencheng Ma
Xuehui Du
Lifeng Cao
spellingShingle Chencheng Ma
Xuehui Du
Lifeng Cao
Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
Electronics
encrypted network flow classification
k-nearest neighbor algorithm
feature selection and weighted
fine-grained analysis
small training set
author_facet Chencheng Ma
Xuehui Du
Lifeng Cao
author_sort Chencheng Ma
title Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
title_short Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
title_full Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
title_fullStr Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
title_full_unstemmed Improved KNN Algorithm for Fine-Grained Classification of Encrypted Network Flow
title_sort improved knn algorithm for fine-grained classification of encrypted network flow
publisher MDPI AG
series Electronics
issn 2079-9292
publishDate 2020-02-01
description The fine-grained classification of encrypted traffic is important for network security analysis. Malicious attacks are usually encrypted and simulated as normal application or content traffic. Supervised machine learning methods are widely used for traffic classification and show good performances. However, they need a large amount of labeled data to train a model, while labeled data is hard to obtain. Aiming at solving this problem, this paper proposes a method to train a model based on the K-nearest neighbor (KNN) algorithm, which only needs a small amount of data. Due to the fact that the importance of different traffic features varies, and traditional KNN does not highlight the importance of different features, this study introduces the concept of feature weight and proposes the weighted feature KNN (WKNN) algorithm. Furthermore, to obtain the optimal feature set and the corresponding feature weight set, a feature selection and feature weight self-adaptive algorithm for WKNN is proposed. In addition, a three-layer classification framework for encrypted network flows is established. Based on the improved KNN and the framework, this study finally presents a method for fine-grained classification of encrypted network flows, which can identify the encryption status, application type and content type of encrypted network flows with high accuracies of 99.3%, 92.4%, and 97.0%, respectively.
topic encrypted network flow classification
k-nearest neighbor algorithm
feature selection and weighted
fine-grained analysis
small training set
url https://www.mdpi.com/2079-9292/9/2/324
work_keys_str_mv AT chenchengma improvedknnalgorithmforfinegrainedclassificationofencryptednetworkflow
AT xuehuidu improvedknnalgorithmforfinegrainedclassificationofencryptednetworkflow
AT lifengcao improvedknnalgorithmforfinegrainedclassificationofencryptednetworkflow
_version_ 1724846202232504320
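
Illustrative note: the description above mentions a weighted feature KNN (WKNN) in which features contribute unequally to the distance. The record does not give the paper's formulas, so the following is only a minimal sketch of the general idea, assuming a feature-weighted squared Euclidean distance and a plain majority vote. The function name `weighted_knn_predict` and the hand-picked weights are hypothetical; the paper's self-adaptive feature selection and weight search are not reproduced here.

```python
from collections import Counter

import numpy as np


def weighted_knn_predict(X_train, y_train, x, weights, k=3):
    """Majority vote among the k nearest training points under a
    feature-weighted squared Euclidean distance (assumed form, not
    necessarily the distance used in the paper)."""
    X_train = np.asarray(X_train, dtype=float)
    x = np.asarray(x, dtype=float)
    w = np.asarray(weights, dtype=float)
    # d(x, t) = sum_j w_j * (x_j - t_j)^2 ; a zero weight drops feature j
    d2 = ((X_train - x) ** 2 * w).sum(axis=1)
    # Stable sort so ties are broken by training-set order
    nearest = np.argsort(d2, kind="stable")[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

With equal weights this reduces to ordinary KNN; setting a weight to zero removes that feature from the distance, which is one simple way feature selection and feature weighting can be expressed in the same vector.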