P2P Flow Identification by Ensemble Classification



Bibliographic Details
Main Authors: Cheng-Ru Wu, 吳承儒
Other Authors: Yie-Tarng Chen
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/21843436536868745208
Description
Summary: Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 97 === Peer-to-peer (P2P) traffic accounts for a major fraction of all Internet traffic, so P2P flow identification has become an important problem for network management. A robust P2P flow identification approach should work without port or payload information, since new-generation P2P applications can use arbitrary port numbers to evade fixed-port blocking and payload encryption to evade P2P signature detection. Previous research that uses machine learning for P2P flow identification suffers from low detection rates and high false positive rates due to a lack of proper features. We propose an ensemble classification approach that integrates a Hidden Markov Model (HMM) with the AdaBoost algorithm. The proposed P2P identification scheme has two stages. In the first stage, we investigate the interchange of small and large packets in P2P flows, identify an important feature called the packet size sequence pattern, and use an HMM to recognize these patterns. In the second stage, we use the AdaBoost algorithm with traditional flow attributes to improve detection accuracy and reduce false positives. To verify the performance of the proposed ensemble-based P2P identification, we collected network traffic traces from the NTUST campus and ran intensive simulations. The simulation results show that the ensemble classification approach achieves a 98% detection rate and a 5% false alarm rate.
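
The first stage described in the summary, recognizing packet size sequence patterns with an HMM, can be illustrated with a minimal sketch. This is not the thesis's implementation: the 200-byte small/large threshold, the two-state models, and every probability below are hand-picked assumptions for illustration only, whereas a real system would train the models on labeled traces. The sketch scores a flow's packet size sequence under a hypothetical "P2P" model (small and large packets tend to alternate) and a hypothetical "other" model (packet sizes tend to persist), using the scaled forward algorithm.

```python
import math

# Assumed threshold (not from the thesis): packets below it count as "small".
SMALL_PACKET_MAX_BYTES = 200


def to_symbols(packet_sizes, threshold=SMALL_PACKET_MAX_BYTES):
    """Map packet sizes in bytes to symbols: 0 = small, 1 = large."""
    return [0 if size < threshold else 1 for size in packet_sizes]


def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: return log P(symbols | HMM)."""
    if not symbols:
        raise ValueError("symbol sequence must not be empty")
    n_states = len(start)
    # Initialization: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s][symbols[0]] for s in range(n_states)]
    total = sum(alpha)
    log_prob = math.log(total)
    alpha = [a / total for a in alpha]
    # Induction: propagate through transitions, then weight by emission.
    for obs in symbols[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
        total = sum(alpha)
        log_prob += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return log_prob


# Hypothetical hand-picked models; state 0 mostly emits small packets,
# state 1 mostly emits large packets.
START = [0.5, 0.5]
EMIT = [[0.9, 0.1], [0.1, 0.9]]
P2P_TRANS = [[0.1, 0.9], [0.9, 0.1]]    # states tend to alternate
OTHER_TRANS = [[0.9, 0.1], [0.1, 0.9]]  # states tend to persist


def classify(packet_sizes):
    """Label a flow by whichever hypothetical model scores it higher."""
    symbols = to_symbols(packet_sizes)
    p2p_score = log_likelihood(symbols, START, P2P_TRANS, EMIT)
    other_score = log_likelihood(symbols, START, OTHER_TRANS, EMIT)
    return "p2p" if p2p_score > other_score else "other"


if __name__ == "__main__":
    print(classify([60, 1400, 60, 1400, 60, 1400]))
    print(classify([1400] * 6))
```

In a two-stage design like the one summarized above, a score of this kind could serve as one input alongside traditional flow attributes for a boosted classifier; how the thesis actually combines the two stages is not specified in this record.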