MLID : A multilabel extension of the ID3 algorithm
Abstract: Machine learning is a subfield of artificial intelligence that revolves around constructing algorithms that can learn from, and make predictions on, data. Instead of following strict and static instructions, the system operates by adapting and learning from input data in order to make predict...
Main Authors: | Starefors, Henrik , Persson, Rasmus |
---|---|
Format: | Others |
Language: | English |
Published: |
Blekinge Tekniska Högskola, Institutionen för programvaruteknik
2016
|
Subjects: | ID3, Multilabel, Machine learning, Software Engineering, Programvaruteknik |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:bth-13667 |
id |
ndltd-UPSALLA1-oai-DiVA.org-bth-13667 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-bth-13667; 2018-01-14T05:11:51Z; Student thesis; info:eu-repo/semantics/bachelorThesis; text; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
sources |
NDLTD |
topic |
ID3 Multilabel Machine learning Software Engineering Programvaruteknik |
description |
Abstract: Machine learning is a subfield of artificial intelligence that revolves around constructing algorithms that can learn from, and make predictions on, data. Instead of following strict and static instructions, the system operates by adapting and learning from input data in order to make predictions and decisions. This work focuses on a subcategory of machine learning called “multilabel classification”, in which items introduced to the system are categorized by an analytical model, learned through supervised learning, where each instance of the dataset can belong to multiple labels, or classes.
This paper presents the task of implementing a multilabel classifier based on the ID3 algorithm, which we call MLID (Multilabel Iterative Dichotomiser). The solution is presented both in a sequentially executed version and in a parallelized one. We also present a comparison, based on accuracy and execution time, against algorithms of a similar nature in order to evaluate the viability of using ID3 as a base to expand and build upon for multilabel classification.
In order to evaluate the performance of the MLID algorithm, we measured execution time and accuracy, and summarized precision and recall into the F-measure, which is the harmonic mean of the algorithm's precision and recall (sensitivity). These results are then compared to established algorithms on a range of datasets of varying sizes in order to assess the viability of the MLID algorithm.
The results of comparing MLID against other multilabel algorithms such as Binary Relevance, Classifier Chains and Random Trees show that MLID can compete with other classifiers in terms of accuracy and F-measure, but that its training time is inferior. From these results, we conclude that MLID is a viable option as a multilabel classifier. Although some constraints inherited from the original ID3 algorithm impede the full utility of the algorithm, we are certain that following the same path of development and improvement that ID3 experienced would allow MLID to develop into a suitable choice of algorithm for a diverse range of multilabel classification problems. |
author |
Starefors, Henrik; Persson, Rasmus |
title |
MLID : A multilabel extension of the ID3 algorithm |