A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning
The increasing ubiquity of network traffic and the deployment of new online applications have increased the complexity of traffic analysis. Traditionally, network administrators have relied on recognizing well-known static ports to classify the traffic flowing through their networks. However, modern network traffic uses...
Main Authors: | Ahmed Abdelmoamen Ahmed, Gbenga Agunsoye |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-08-01 |
Series: | Algorithms |
Subjects: | real-time; traffic classifier; network flow; machine learning; KNN; RF |
Online Access: | https://www.mdpi.com/1999-4893/14/8/250 |
id |
doaj-27923020597840f695dda15ff2f60c0e |
---|---|
record_format |
Article |
spelling |
doaj-27923020597840f695dda15ff2f60c0e; 2021-08-26T13:26:27Z; eng; MDPI AG; Algorithms; 1999-4893; 2021-08-01; 14; 250; 250; 10.3390/a14080250
A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning
Ahmed Abdelmoamen Ahmed [0]; Gbenga Agunsoye [1]
[0] Department of Computer Science, Prairie View A&M University, Prairie View, TX 77446, USA
[1] Department of Computer Science, Prairie View A&M University, Prairie View, TX 77446, USA
The increasing ubiquity of network traffic and the deployment of new online applications have increased the complexity of traffic analysis. Traditionally, network administrators have relied on recognizing well-known static ports to classify the traffic flowing through their networks. However, modern network traffic uses dynamic ports and is transported over secure application-layer protocols (e.g., HTTPS, SSL, and SSH). This makes it challenging for network administrators to identify online applications using traditional port-based approaches. One way to classify modern network traffic is to use machine learning (ML) to distinguish between traffic attributes such as packet count and size, packet inter-arrival time, and packet send–receive ratio. This paper presents the design and implementation of NetScrapper, a flow-based network traffic classifier for online applications. NetScrapper uses three ML models, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Artificial Neural Network (ANN), to classify the 53 most popular online applications, including Amazon, YouTube, Google, Twitter, and many others. We collected a network traffic dataset containing 3,577,296 packet flows with 87 different features for training, validating, and testing the ML models. A user-friendly web-based interface was developed to enable users to either upload a snapshot of their network traffic to NetScrapper or sniff the network traffic directly from the network interface card in real time. Additionally, we created a middleware pipeline for interfacing the three models with the Flask GUI. Finally, we evaluated NetScrapper using various performance metrics such as classification accuracy and prediction time. Most notably, we found that our ANN model achieves an overall classification accuracy of 99.86% in recognizing the online applications in our dataset.
https://www.mdpi.com/1999-4893/14/8/250
real-time; traffic classifier; network flow; machine learning; KNN; RF |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ahmed Abdelmoamen Ahmed; Gbenga Agunsoye |
author_sort |
Ahmed Abdelmoamen Ahmed |
title |
A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning |
title_sort |
real-time network traffic classifier for online applications using machine learning |
publisher |
MDPI AG |
series |
Algorithms |
issn |
1999-4893 |
publishDate |
2021-08-01 |
description |
The increasing ubiquity of network traffic and the deployment of new online applications have increased the complexity of traffic analysis. Traditionally, network administrators have relied on recognizing well-known static ports to classify the traffic flowing through their networks. However, modern network traffic uses dynamic ports and is transported over secure application-layer protocols (e.g., HTTPS, SSL, and SSH). This makes it challenging for network administrators to identify online applications using traditional port-based approaches. One way to classify modern network traffic is to use machine learning (ML) to distinguish between traffic attributes such as packet count and size, packet inter-arrival time, and packet send–receive ratio. This paper presents the design and implementation of NetScrapper, a flow-based network traffic classifier for online applications. NetScrapper uses three ML models, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Artificial Neural Network (ANN), to classify the 53 most popular online applications, including Amazon, YouTube, Google, Twitter, and many others. We collected a network traffic dataset containing 3,577,296 packet flows with 87 different features for training, validating, and testing the ML models. A user-friendly web-based interface was developed to enable users to either upload a snapshot of their network traffic to NetScrapper or sniff the network traffic directly from the network interface card in real time. Additionally, we created a middleware pipeline for interfacing the three models with the Flask GUI. Finally, we evaluated NetScrapper using various performance metrics such as classification accuracy and prediction time. Most notably, we found that our ANN model achieves an overall classification accuracy of 99.86% in recognizing the online applications in our dataset. |
topic |
real-time; traffic classifier; network flow; machine learning; KNN; RF |
url |
https://www.mdpi.com/1999-4893/14/8/250 |
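
The description in this record names per-flow attributes (packet count and size, packet inter-arrival time, and the send/receive ratio) as inputs to ML-based traffic classification. As a rough illustration only, and not NetScrapper's actual code, the sketch below shows how such attributes might be computed for a single flow. The function name `flow_features`, the tuple layout, and the `"out"`/`"in"` direction labels are all hypothetical assumptions made for this example.

```python
from statistics import mean


def flow_features(packets):
    """Compute simple per-flow features.

    `packets` is a list of (timestamp_seconds, size_bytes, direction) tuples,
    where direction is "out" (sent) or "in" (received). This layout is an
    assumption for illustration, not a format taken from the paper.
    """
    if not packets:
        raise ValueError("flow has no packets")

    # Order packets by arrival time so inter-arrival gaps are meaningful.
    packets = sorted(packets, key=lambda p: p[0])
    times = [p[0] for p in packets]
    sizes = [p[1] for p in packets]

    sent = sum(1 for p in packets if p[2] == "out")
    received = len(packets) - sent

    # Gaps between consecutive packets; a single-packet flow has none.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    return {
        "packet_count": len(packets),
        "mean_packet_size": mean(sizes),
        "mean_inter_arrival": mean(gaps) if gaps else 0.0,
        # Avoid division by zero when nothing was received.
        "send_receive_ratio": sent / received if received else float("inf"),
    }


# A made-up four-packet flow, used only to exercise the function.
example = [(0.0, 100, "out"), (0.5, 300, "in"), (1.5, 200, "out"), (2.0, 400, "in")]
features = flow_features(example)
```

A feature dictionary like this could then be turned into a numeric vector and passed to a classifier such as KNN or Random Forest; the record does not say how the paper's 87 features are defined, so these four are stand-ins.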