Classification of Video Traffic : An Evaluation of Video Traffic Classification using Random Forests and Gradient Boosted Trees

Bibliographic Details
Main Author: Andersson, Ricky
Format: Others
Language: English
Published: Karlstads universitet, Fakulteten för hälsa, natur- och teknikvetenskap (from 2013) 2017
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-55302
Description
Summary: Traffic classification is important for Internet providers and other organizations to solve some critical network management problems. The most common methods for traffic classification are Deep Packet Inspection (DPI) and port-based classification. These methods are becoming obsolete as more and more traffic is encrypted and applications start to use dynamic ports and the ports of other popular applications. An alternative method for traffic classification uses Machine Learning (ML). This ML method uses statistical features of network traffic flows, which solves the fundamental problems of DPI and port-based classification for encrypted flows. The data used in this study is divided into video and non-video traffic flows, and the goal of the study is to create a model that can classify video flows accurately in real time. Previous studies found tree-based algorithms to work well in classifying network traffic. In this study, random forest and gradient boosted trees are examined and compared, as they are two of the best-performing tree-based classification models. Random forest was found to work best, as its classification speed was significantly faster than that of gradient boosted trees. Over 93% of flows were classified correctly while keeping the random forest model small enough to maintain fast classification speed.
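
As an illustration of what "statistical features of network traffic flows" can look like, the following is a minimal Python sketch that reduces one flow to a handful of summary statistics. The function name `flow_features`, the `(timestamp, size)` input format, and the particular features are assumptions made for this example; this record does not list the features the thesis actually uses.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Summarize one flow as a dict of statistical features.

    `packets` is a list of (timestamp_seconds, size_bytes) tuples in
    arrival order. Both this format and the feature set are hypothetical,
    chosen only to illustrate the idea.
    """
    if not packets:
        raise ValueError("flow has no packets")

    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-arrival gaps between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    return {
        "packet_count": len(sizes),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),  # population standard deviation
        "mean_gap": mean(gaps) if gaps else 0.0,
        "duration": times[-1] - times[0],
    }


# Example: a three-packet flow.
features = flow_features([(0.0, 100), (1.0, 300), (3.0, 200)])
```

None of these values depend on packet payloads or port numbers, which is why features of this kind remain usable for encrypted flows. A feature vector like this could then be passed to a tree-based classifier such as a random forest.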