Summary: | Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 103 === Network packet tracing is used for many purposes, such as protocol analysis, network performance analysis, and network software debugging. Because packet traces can grow very large, it is difficult to analyze them efficiently and promptly during research. In this paper, we propose a Hadoop-based Internet traffic extractor to address this big data problem. The extractor extracts the bi-directional flows that contain the traffic of interest, reorders the packets in each flow, and then places the reordered packets of each flow into a folder for further analysis.
MapReduce is the core technology of Hadoop [1] for processing big data. We resolve the design issues of the Internet traffic extractor by taking the properties of MapReduce into account. Our experiments verify that the extractor correctly extracts the specified flows. Furthermore, the extractor can handle big data by relying on the scalability of a Hadoop cluster.
|
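The extraction steps described in the summary (group packets into bi-directional flows, keep the flows that contain the traffic of interest, reorder the packets, and assign each flow to a folder) can be illustrated with a minimal in-memory sketch of the map, shuffle, and reduce stages. This is not the implementation from the thesis and does not use the Hadoop API. The `Packet` fields, the direction-independent flow key, reordering by capture timestamp, and the folder naming scheme are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    """Hypothetical packet record; a real trace would be parsed from a capture file."""
    ts: float          # capture timestamp in seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


FlowKey = Tuple[str, Tuple[str, int], Tuple[str, int]]


def flow_key(pkt: Packet) -> FlowKey:
    """Build a direction-independent key so both directions share one flow."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    low, high = sorted((a, b))  # canonical endpoint order (assumption)
    return (pkt.proto, low, high)


def map_phase(packets: Iterable[Packet]) -> Iterator[Tuple[FlowKey, Packet]]:
    """Map step: emit one (flow key, packet) pair per packet."""
    for pkt in packets:
        yield flow_key(pkt), pkt


def shuffle(pairs: Iterable[Tuple[FlowKey, Packet]]) -> Dict[FlowKey, List[Packet]]:
    """Stand-in for the framework's shuffle: group values by key."""
    groups: Dict[FlowKey, List[Packet]] = defaultdict(list)
    for key, pkt in pairs:
        groups[key].append(pkt)
    return groups


def reduce_phase(pkts: List[Packet],
                 is_interesting: Callable[[Packet], bool]) -> Optional[List[Packet]]:
    """Reduce step: drop flows without traffic of interest, then reorder.

    Reordering by timestamp is an assumption; the thesis may use another criterion.
    """
    if not any(is_interesting(p) for p in pkts):
        return None
    return sorted(pkts, key=lambda p: p.ts)


def folder_name(key: FlowKey) -> str:
    """Hypothetical per-flow folder name derived from the flow key."""
    proto, (ip_a, port_a), (ip_b, port_b) = key
    return f"{proto}_{ip_a}_{port_a}_{ip_b}_{port_b}"


def extract_flows(packets: Iterable[Packet],
                  is_interesting: Callable[[Packet], bool]) -> Dict[str, List[Packet]]:
    """Run map, shuffle, and reduce; return folder name -> ordered packets."""
    result: Dict[str, List[Packet]] = {}
    for key, pkts in shuffle(map_phase(packets)).items():
        ordered = reduce_phase(pkts, is_interesting)
        if ordered is not None:
            result[folder_name(key)] = ordered
    return result
```

For example, calling `extract_flows(packets, lambda p: p.dst_port == 80)` would keep only the flows in which at least one packet is addressed to port 80, with both directions of each such flow merged and sorted by timestamp. On a real cluster the grouping would be performed by the framework's shuffle rather than by an in-memory dictionary.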