Data Analysis of Minimally-Structured Heterogeneous Logs : An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes.
Nowadays, the ideas of continuous integration and continuous delivery are under heavy usage in order to achieve rapid software development speed and quick product delivery to the customers with good quality. During the process of modern software development, the testing stage has always been with gre...
Main Author: | Liu, Chang |
---|---|
Format: | Others |
Language: | English |
Published: | KTH, Skolan för datavetenskap och kommunikation (CSC) 2016 |
Subjects: | Data analysis; Log analysis; RNN; Naive Bayes |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-191334 |
id |
ndltd-UPSALLA1-oai-DiVA.org-kth-191334 |
---|---|
record_format |
oai_dc |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Data analysis; Log analysis; RNN; Naive Bayes |
description |
The ideas of continuous integration and continuous delivery are now widely used to achieve rapid software development and quick delivery of high-quality products to customers. In modern software development, the testing stage has always been of great significance, ensuring that the delivered software meets all requirements and offers high quality, maintainability, sustainability, scalability, etc. The key task of software testing is to find bugs in every test and resolve them. The developers and test engineers at Ericsson, who work on a large-scale software architecture, rely mainly on the logs generated during testing to debug the software; these logs contain important information about system behavior and software status. However, the volume of the data is too large and its variety too complex and unpredictable, so manually locating and resolving bugs in such a vast amount of log data is very time-consuming and laborious. The objective of this thesis project is to explore a way to conduct log analysis efficiently and effectively by applying relevant machine learning algorithms, in order to help people quickly detect test failures and their possible causes. In this project, a method for preprocessing and clustering the original logs is designed and implemented to obtain useful data that can be fed to machine learning algorithms. A comparative log analysis based on two machine learning algorithms, Recurrent Neural Network and Naive Bayes, is then conducted to locate system failures and anomalies. Finally, relevant experimental results are provided and analyzed. |
author |
Liu, Chang |
title |
Data Analysis of Minimally-Structured Heterogeneous Logs : An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes. |
publisher |
KTH, Skolan för datavetenskap och kommunikation (CSC) |
publishDate |
2016 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-191334 |
_version_ |
1718381158435127296 |