Exploring Change Point Detection in Network Equipment Logs

Change point detection (CPD) is the method of detecting sudden changes in timeseries, and its importance is great concerning network traffic. With increased knowledge of occurring changes in data logs due to updates in networking equipment,a deeper understanding is allowed for interactions between t...

Full description

Bibliographic Details
Main Author: Björk, Tim
Format: Others
Language:English
Published: Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013) 2021
Subjects:
Online Access:http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-85626
Description
Summary:Change point detection (CPD) is the method of detecting sudden changes in timeseries, and its importance is great concerning network traffic. With increased knowledge of occurring changes in data logs due to updates in networking equipment,a deeper understanding is allowed for interactions between the updates and theoperational resource usage. In a data log that reflects the amount of network traffic, there are large variations in the time series because of reasons such as connectioncount or external changes to the system. To circumvent these unwanted variationchanges and assort the deliberate variation changes is a challenge. In this thesis, we utilize data logs retrieved from a network equipment vendor to detect changes, then compare the detected changes to when firmware/signature updates were applied, configuration changes were made, etc. with the goal to achieve a deeper understanding of any interaction between firmware/signature/configuration changes and operational resource usage. Challenges in the data quality and data processing are addressed through data manipulation to counteract anomalies and unwanted variation, as well as experimentation with parameters to achieve the most ideal settings. Results are produced through experiments to test the accuracy of the various change pointdetection methods, and for investigation of various parameter settings. Through trial and error, a satisfactory configuration is achieved and used in large scale log detection experiments. The results from the experiments conclude that additional information about how changes in variation arises is required to derive the desired understanding.