Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems

Combining embedded systems and machine learning models is an exciting prospect. However, to target even embedded systems with the most stringent resource requirements, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark model is created with LightGBM, a popular framework for gradient-boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework, and then further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework called ESRule. The data come from the domain of frequency measurements in cellular networks, where there is a clear use case for embedded systems running the resulting resource-optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average while simultaneously increasing predictive performance; the models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard-disk space than LightGBM, and ESRule is also clearly faster at predicting a single sample.
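The RuleFit step described in this record (decision-tree paths become binary rule features, and a LASSO fit prunes most of them, leaving a small rule set an embedded device can evaluate cheaply) can be sketched roughly as follows. This is an illustrative, stdlib-only approximation, not the thesis's ESRule code; the rules, data, and function names are all hypothetical.

```python
# Hypothetical sketch of the RuleFit idea: binary rule features + LASSO.

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; the core of the LASSO update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(X, y, lam=0.05, iters=200):
    """Coordinate-descent LASSO for min 0.5*||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # Partial residual: what is left of y once all other features act.
            r = [y[i] - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            num = sum(X[i][j] * r[i] for i in range(n))
            den = sum(X[i][j] ** 2 for i in range(n)) or 1.0
            beta[j] = soft_threshold(num / den, lam / den)
    return beta

# Rules extracted from (hypothetical) tree paths, applied to samples x.
rules = [
    lambda x: x[0] > 2.0,                  # a genuinely predictive rule
    lambda x: x[0] > 2.0 and x[1] < 1.0,   # a deeper-path rule
    lambda x: x[1] < 5.0,                  # always true here; LASSO can drop it
]
data = [(0.0, 0.5), (3.0, 0.5), (3.0, 2.0), (1.0, 4.0)]
y = [0.0, 2.0, 1.0, 0.0]

# Each sample becomes a binary vector of rule activations.
X = [[1.0 if r(x) else 0.0 for r in rules] for x in data]
beta = lasso(X, y)
# The useless always-true rule receives (near-)zero weight and can be pruned,
# so the embedded model only needs to evaluate the surviving rules.
```

The appeal for constrained hardware is that each surviving rule is just a few threshold comparisons plus one weight, rather than a full tree traversal.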

Bibliographic Details
Main Author: Lundberg, Jacob
Format: Others
Language: English
Published: Linköpings universitet, Statistik och maskininlärning 2019
Subjects:
Online Access:http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013
id ndltd-UPSALLA1-oai-DiVA.org-liu-162013
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-1620132019-11-20T22:04:27ZResource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systemsengResurseffektiv Representation av MaskininlärningsmodellerLundberg, JacobLinköpings universitet, Statistik och maskininlärning2019machine learningrule fitdecision treesembedded systemsresourcesensemble methodslassoregressionoptimizationComputer EngineeringDatorteknikCombining embedded systems and machine learning models is an exciting prospect. However, to fully target any embedded system, with the most stringent resource requirements, the models have to be designed with care not to overwhelm it. Decision tree ensembles are targeted in this thesis. A benchmark model is created with LightGBM, a popular framework for gradient boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework. Then it is further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework, called ESRule. The data used comes from the domain of frequency measurements in cellular networks. There is a clear use-case where embedded systems can use the produced resource optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average, simultaneously increasing predictive performance. The models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard disk space than LightGBM. ESRule is also clearly faster at predicting a single sample. Student thesisinfo:eu-repo/semantics/bachelorThesistexthttp://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013application/pdfinfo:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic machine learning
rule fit
decision trees
embedded systems
resources
ensemble methods
lasso
regression
optimization
Computer Engineering
Datorteknik
spellingShingle machine learning
rule fit
decision trees
embedded systems
resources
ensemble methods
lasso
regression
optimization
Computer Engineering
Datorteknik
Lundberg, Jacob
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
description Combining embedded systems and machine learning models is an exciting prospect. However, to target even embedded systems with the most stringent resource requirements, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark model is created with LightGBM, a popular framework for gradient-boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework, and then further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework called ESRule. The data come from the domain of frequency measurements in cellular networks, where there is a clear use case for embedded systems running the resulting resource-optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average while simultaneously increasing predictive performance; the models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard-disk space than LightGBM, and ESRule is also clearly faster at predicting a single sample.
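The quantization and weight-sharing step the description borrows from neural-network compression can likewise be sketched: real-valued rule weights are clustered into a small codebook of shared values, so the model stores one short index per rule plus the tiny codebook instead of one full-precision float per rule. This is a hypothetical, stdlib-only illustration (simple 1-D k-means), not ESRule's actual implementation; all names are invented.

```python
# Illustrative weight sharing: replace each weight with an index into a
# small codebook of shared values, as in neural-network compression.

def share_weights(weights, k=4, iters=25):
    """Cluster 1-D weights into k shared values via simple k-means."""
    lo, hi = min(weights), max(weights)
    # Initialise centroids evenly across the weight range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        assign = [min(range(k), key=lambda j: abs(w - centroids[j]))
                  for w in weights]
        # Recompute each centroid as the mean of its cluster.
        for j in range(k):
            members = [w for w, a in zip(weights, assign) if a == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, assign

weights = [0.11, 0.12, 0.95, 1.02, -0.48, -0.51, 0.13, 0.98]
codebook, idx = share_weights(weights, k=3)

# Storage drops from 8 floats to 8 two-bit indices plus 3 shared floats;
# prediction reconstructs an approximate weight by a codebook lookup.
approx = [codebook[i] for i in idx]
```

With, say, a 4-entry codebook each weight index fits in 2 bits, which is the kind of saving that, combined with rule pruning, plausibly underlies the large memory reductions the description reports.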
author Lundberg, Jacob
author_facet Lundberg, Jacob
author_sort Lundberg, Jacob
title Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_short Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_full Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_fullStr Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_full_unstemmed Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_sort resource efficient representation of machine learning models : investigating optimization options for decision trees in embedded systems
publisher Linköpings universitet, Statistik och maskininlärning
publishDate 2019
url http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013
work_keys_str_mv AT lundbergjacob resourceefficientrepresentationofmachinelearningmodelsinvestigatingoptimizationoptionsfordecisiontreesinembeddedsystems
AT lundbergjacob resurseffektivrepresentationavmaskininlarningsmodeller
_version_ 1719293441522794496