MLID: A multilabel extension of the ID3 algorithm

Abstract: Machine learning is a subfield within artificial intelligence that revolves around constructing algorithms that can learn from, and make predictions on, data. Instead of following strict and static instructions, the system operates by adapting and learning from input data in order to make predict...

Bibliographic Details
Main Authors: Starefors, Henrik; Persson, Rasmus
Format: Others
Language: English
Published: Blekinge Tekniska Högskola, Institutionen för programvaruteknik 2016
Subjects:
ID3
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-13667
id ndltd-UPSALLA1-oai-DiVA.org-bth-13667
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-bth-13667
2018-01-14T05:11:51Z
MLID: A multilabel extension of the ID3 algorithm
eng
Starefors, Henrik
Persson, Rasmus
Blekinge Tekniska Högskola, Institutionen för programvaruteknik
2016
ID3
Multilabel
Machine learning
Software Engineering
Programvaruteknik
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:bth-13667
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic ID3
Multilabel
Machine learning
Software Engineering
Programvaruteknik
description Abstract: Machine learning is a subfield of artificial intelligence that revolves around constructing algorithms that can learn from, and make predictions on, data. Instead of following strict and static instructions, the system operates by adapting to and learning from input data in order to make predictions and decisions. This work focuses on a subcategory of machine learning called “multilabel classification”, in which items introduced to the system are categorized by an analytical model, learned through supervised learning, where each instance of the dataset can belong to multiple labels, or classes.
This paper presents the task of implementing a multilabel classifier based on the ID3 algorithm, which we call MLID (Multilabel Iterative Dichotomiser). The solution is presented both in a sequentially executed version and in a parallelized one. We also present a comparison, based on accuracy and execution time, against algorithms of a similar nature in order to evaluate the viability of using ID3 as a base to expand and build upon for multilabel classification.
In order to evaluate the performance of the MLID algorithm, we measured its execution time and accuracy, and summarized precision and recall into the F-measure, which is the harmonic mean of the algorithm's precision and recall (sensitivity). These results are then compared to established algorithms on a range of datasets of varying sizes in order to assess the viability of the MLID algorithm.
The results produced when comparing MLID against other multilabel algorithms such as Binary Relevance, Classifier Chains and Random Trees show that MLID can compete with other classifiers in terms of accuracy and F-measure, but its training time proved inferior. From these results, we conclude that MLID is a viable option as a multilabel classifier. Although some constraints inherited from the original ID3 algorithm impede the full utility of the algorithm, we are certain that following the same path of development and improvement that ID3 experienced would allow MLID to develop into a suitable choice of algorithm for a diverse range of multilabel classification problems.
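The abstract describes the F-measure as the harmonic mean of precision and recall. The following is a minimal sketch of that calculation for multilabel predictions, not the thesis's own evaluation code: the micro-averaging scheme, the function names `f_measure` and `micro_precision_recall`, and the toy label sets are all assumptions made for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_precision_recall(true_sets, predicted_sets):
    """Micro-averaged precision and recall over per-instance label sets.

    Assumption: labels are pooled across all instances before dividing;
    the thesis may have used a different averaging scheme.
    """
    true_positives = sum(len(t & p) for t, p in zip(true_sets, predicted_sets))
    predicted_total = sum(len(p) for p in predicted_sets)
    actual_total = sum(len(t) for t in true_sets)
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / actual_total if actual_total else 0.0
    return precision, recall


# Hypothetical toy data: each instance may carry several labels at once.
true_sets = [{"a", "b"}, {"c"}, {"a", "c"}]
predicted_sets = [{"a"}, {"c", "b"}, {"a", "c"}]

precision, recall = micro_precision_recall(true_sets, predicted_sets)
score = f_measure(precision, recall)
print(f"precision={precision:.2f} recall={recall:.2f} F={score:.2f}")
```

On this toy data, four of the five predicted labels are correct and four of the five true labels are recovered, so precision, recall and the F-measure all come out at 0.80.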
author Starefors, Henrik
Persson, Rasmus
title MLID: A multilabel extension of the ID3 algorithm
publisher Blekinge Tekniska Högskola, Institutionen för programvaruteknik
publishDate 2016
url http://urn.kb.se/resolve?urn=urn:nbn:se:bth-13667