A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning

The increasing ubiquity of network traffic and the deployment of new online applications have increased the complexity of traffic analysis. Traditionally, network administrators rely on recognizing well-known static ports to classify the traffic flowing through their networks. However, modern network traffic uses...

Bibliographic Details
Main Authors: Ahmed Abdelmoamen Ahmed, Gbenga Agunsoye
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Algorithms
Subjects:
KNN
RF
Online Access: https://www.mdpi.com/1999-4893/14/8/250
id doaj-27923020597840f695dda15ff2f60c0e
record_format Article
spelling doaj-27923020597840f695dda15ff2f60c0e
2021-08-26T13:26:27Z
eng
MDPI AG
Algorithms
1999-4893
2021-08-01
14
250
250
10.3390/a14080250
A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning
Ahmed Abdelmoamen Ahmed
Gbenga Agunsoye
Department of Computer Science, Prairie View A&M University, Prairie View, TX 77446, USA
Department of Computer Science, Prairie View A&M University, Prairie View, TX 77446, USA
https://www.mdpi.com/1999-4893/14/8/250
real-time
traffic classifier
network flow
machine learning
KNN
RF
collection DOAJ
language English
format Article
sources DOAJ
author Ahmed Abdelmoamen Ahmed
Gbenga Agunsoye
author_facet Ahmed Abdelmoamen Ahmed
Gbenga Agunsoye
author_sort Ahmed Abdelmoamen Ahmed
title A Real-Time Network Traffic Classifier for Online Applications Using Machine Learning
title_sort real-time network traffic classifier for online applications using machine learning
publisher MDPI AG
series Algorithms
issn 1999-4893
publishDate 2021-08-01
description The increasing ubiquity of network traffic and the deployment of new online applications have increased the complexity of traffic analysis. Traditionally, network administrators rely on recognizing well-known static ports to classify the traffic flowing through their networks. However, modern network traffic uses dynamic ports and is transported over secure application-layer protocols (e.g., HTTPS, SSL, and SSH). This makes it challenging for network administrators to identify online applications using traditional port-based approaches. One way to classify modern network traffic is to use machine learning (ML) to distinguish between traffic attributes such as packet count and size, packet inter-arrival time, and packet send–receive ratio. This paper presents the design and implementation of NetScrapper, a flow-based network traffic classifier for online applications. NetScrapper uses three ML models, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Artificial Neural Network (ANN), to classify the 53 most popular online applications, including Amazon, YouTube, Google, Twitter, and many others. We collected a network traffic dataset containing 3,577,296 packet flows with 87 different features for training, validating, and testing the ML models. A user-friendly web-based interface was developed to enable users either to upload a snapshot of their network traffic to NetScrapper or to sniff the network traffic directly from the network interface card in real time. Additionally, we created a middleware pipeline for interfacing the three models with the Flask GUI. Finally, we evaluated NetScrapper using performance metrics such as classification accuracy and prediction time. Most notably, we found that our ANN model achieves an overall classification accuracy of 99.86% in recognizing the online applications in our dataset.
topic real-time
traffic classifier
network flow
machine learning
KNN
RF
url https://www.mdpi.com/1999-4893/14/8/250
work_keys_str_mv AT ahmedabdelmoamenahmed arealtimenetworktrafficclassifierforonlineapplicationsusingmachinelearning
AT gbengaagunsoye arealtimenetworktrafficclassifierforonlineapplicationsusingmachinelearning
AT ahmedabdelmoamenahmed realtimenetworktrafficclassifierforonlineapplicationsusingmachinelearning
AT gbengaagunsoye realtimenetworktrafficclassifierforonlineapplicationsusingmachinelearning
_version_ 1721195335230947328
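
The description above names flow-level attributes (packet count and size, inter-arrival time, send–receive ratio) as inputs to a K-Nearest Neighbors classifier. The following is a minimal sketch of that general idea only, not the NetScrapper implementation: the four features, the toy flows, the labels, and the helper names (fit_scaler, scale, knn_predict) are all hypothetical and chosen for illustration, and the real system is described as using 87 features and 53 application classes.

```python
import math
from collections import Counter

# Hypothetical toy flows: (packet_count, mean_packet_size_bytes,
# mean_inter_arrival_ms, send_receive_ratio) -> application label.
# These values are invented for illustration, not taken from the paper's dataset.
TRAIN = [
    ((1200, 1350.0, 2.0, 0.05), "YouTube"),
    ((1100, 1400.0, 2.5, 0.04), "YouTube"),
    ((900, 1300.0, 3.0, 0.06), "YouTube"),
    ((40, 400.0, 120.0, 0.80), "Twitter"),
    ((55, 450.0, 100.0, 0.90), "Twitter"),
    ((35, 380.0, 150.0, 0.70), "Twitter"),
]


def fit_scaler(rows):
    """Return per-feature (min, max) so features share a comparable scale."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def scale(row, bounds):
    """Min-max scale one feature vector; constant features map to 0.0."""
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, (lo, hi) in zip(row, bounds)
    )


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training flows."""
    bounds = fit_scaler([features for features, _ in train])
    q = scale(query, bounds)
    # Sort training flows by Euclidean distance in the scaled feature space.
    neighbours = sorted(
        train, key=lambda item: math.dist(scale(item[0], bounds), q)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


print(knn_predict(TRAIN, (1000, 1320.0, 2.2, 0.05)))
print(knn_predict(TRAIN, (50, 420.0, 110.0, 0.85)))
```

Scaling matters in this sketch because raw packet counts and byte sizes would otherwise dominate the distance and drown out the ratio feature. A production classifier would more likely rely on an established ML library than on hand-written distance code.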