Deployment failure analysis using machine learning
Manually diagnosing recurrent faults in software systems can be an inefficient use of time for engineers. Manual diagnosis of faults is commonly performed by inspecting system logs during the failure time. The DevOps engineers in Pipedrive, a SaaS business offering a sales CRM platform, have develop...
Main Author: | Alviste, Joosep Franz Moorits |
---|---|
Format: | Others |
Language: | English |
Published: | Uppsala universitet, Institutionen för informationsteknologi, 2020 |
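Subjects: | machine learning; log mining; log parsing; pipedrive; deployment failure analysis; failure analysis; classification; log files; Computer Sciences; Datavetenskap (datalogi) |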
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-420321 |
id |
ndltd-UPSALLA1-oai-DiVA.org-uu-420321 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-uu-420321 2020-11-06T05:34:07Z Deployment failure analysis using machine learning eng Alviste, Joosep Franz Moorits Uppsala universitet, Institutionen för informationsteknologi 2020 machine learning log mining log parsing pipedrive deployment failure analysis failure analysis classification log files Computer Sciences Datavetenskap (datalogi) Manually diagnosing recurrent faults in software systems can be an inefficient use of time for engineers. Manual diagnosis of faults is commonly performed by inspecting system logs during the failure time. The DevOps engineers in Pipedrive, a SaaS business offering a sales CRM platform, have developed a simple regular-expression-based service for automatically classifying failed deployments. However, such a solution is not scalable, and a more sophisticated solution is required. In this thesis, log mining was used to automatically diagnose Pipedrive's failed deployments based on the deployment logs. Multiple log parsing and machine learning algorithms were compared based on the resulting log mining pipeline's F1 score. A proof-of-concept log mining pipeline was created that consisted of log parsing with the Drain algorithm, transforming the log files into event count vectors and finally training a random forest machine learning model to classify the deployment logs. The pipeline gave an F1 score of 0.75 when classifying testing data and a lower score of 0.65 when classifying the evaluation dataset. Student thesis info:eu-repo/semantics/bachelorThesis text http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-420321 application/pdf info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
machine learning; log mining; log parsing; pipedrive; deployment failure analysis; failure analysis; classification; log files; Computer Sciences; Datavetenskap (datalogi) |
description |
Manually diagnosing recurrent faults in software systems can be an inefficient use of time for engineers. Manual diagnosis of faults is commonly performed by inspecting system logs during the failure time. The DevOps engineers in Pipedrive, a SaaS business offering a sales CRM platform, have developed a simple regular-expression-based service for automatically classifying failed deployments. However, such a solution is not scalable, and a more sophisticated solution is required. In this thesis, log mining was used to automatically diagnose Pipedrive's failed deployments based on the deployment logs. Multiple log parsing and machine learning algorithms were compared based on the resulting log mining pipeline's F1 score. A proof-of-concept log mining pipeline was created that consisted of log parsing with the Drain algorithm, transforming the log files into event count vectors and finally training a random forest machine learning model to classify the deployment logs. The pipeline gave an F1 score of 0.75 when classifying testing data and a lower score of 0.65 when classifying the evaluation dataset. |
author |
Alviste, Joosep Franz Moorits |
title |
Deployment failure analysis using machine learning |
publisher |
Uppsala universitet, Institutionen för informationsteknologi |
publishDate |
2020 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-420321 |
work_keys_str_mv |
AT alvistejoosepfranzmoorits deploymentfailureanalysisusingmachinelearning |
_version_ |
1719355662336524288 |
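
The description above mentions two steps that are easy to illustrate in isolation: turning a parsed log file into an event count vector, and scoring predictions with F1. The following is a minimal sketch of those two steps only, not the thesis's implementation. It does not include Drain parsing or random forest training, and the event IDs (`E1`, `E2`, `E3`), log names, and function names are hypothetical, chosen purely for illustration.

```python
from collections import Counter


def event_count_vector(event_ids, vocabulary):
    """Count how often each known event template occurs in one log file."""
    counts = Counter(event_ids)
    return [counts.get(event, 0) for event in vocabulary]


def f1_score(y_true, y_pred, positive):
    """F1 for a single class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical event IDs, as a log parser might emit for two deployment logs.
vocabulary = ["E1", "E2", "E3"]
logs = {
    "deploy-a": ["E1", "E2", "E2", "E3"],
    "deploy-b": ["E1", "E1"],
}
vectors = {
    name: event_count_vector(events, vocabulary)
    for name, events in logs.items()
}
```

Each vector has one position per event template in `vocabulary`, so logs of different lengths become fixed-size inputs that a classifier could consume. How the thesis averaged F1 across failure classes is not stated in this record, so the sketch scores one class at a time.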