Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
Combining embedded systems and machine learning models is an exciting prospect. However, to target even the most resource-constrained embedded systems, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark...
Main Author: | Lundberg, Jacob |
---|---|
Format: | Others |
Language: | English |
Published: | Linköpings universitet, Statistik och maskininlärning, 2019 |
Subjects: | machine learning; rule fit; decision trees; embedded systems; resources; ensemble methods; lasso; regression; optimization; Computer Engineering; Datorteknik |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013 |
id |
ndltd-UPSALLA1-oai-DiVA.org-liu-162013 |
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-liu-162013 (2019-11-20T22:04:27Z). Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems [eng]. Alternative title: Resurseffektiv Representation av Maskininlärningsmodeller. Lundberg, Jacob. Linköpings universitet, Statistik och maskininlärning, 2019. Subjects: machine learning; rule fit; decision trees; embedded systems; resources; ensemble methods; lasso; regression; optimization; Computer Engineering; Datorteknik. Abstract: Combining embedded systems and machine learning models is an exciting prospect. However, to target even the most resource-constrained embedded systems, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark model is created with LightGBM, a popular framework for gradient-boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework, and then further optimized with quantization and weight sharing, techniques used in compressing neural networks. The entire process is combined into a novel framework called ESRule. The data come from the domain of frequency measurements in cellular networks, where there is a clear use case for embedded systems running the resulting resource-optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average while simultaneously increasing predictive performance; the models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard-disk space than LightGBM, and ESRule is also clearly faster at predicting a single sample. Student thesis (info:eu-repo/semantics/bachelorThesis), text, application/pdf, open access (info:eu-repo/semantics/openAccess). http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013 |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
machine learning; rule fit; decision trees; embedded systems; resources; ensemble methods; lasso; regression; optimization; Computer Engineering; Datorteknik |
spellingShingle |
machine learning; rule fit; decision trees; embedded systems; resources; ensemble methods; lasso; regression; optimization; Computer Engineering; Datorteknik. Lundberg, Jacob. Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems |
description |
Combining embedded systems and machine learning models is an exciting prospect. However, to target even the most resource-constrained embedded systems, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark model is created with LightGBM, a popular framework for gradient-boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework, and then further optimized with quantization and weight sharing, techniques used in compressing neural networks. The entire process is combined into a novel framework called ESRule. The data come from the domain of frequency measurements in cellular networks, where there is a clear use case for embedded systems running the resulting resource-optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average while simultaneously increasing predictive performance; the models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard-disk space than LightGBM, and ESRule is also clearly faster at predicting a single sample. |
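The quantization and weight-sharing step the abstract mentions can be illustrated with a minimal generic sketch. This is not the thesis's ESRule implementation; the `quantize_weights` helper, the 16-level codebook, and the k-means-style assignment are illustrative assumptions. The idea is that many rule or leaf weights can share a small codebook of values, so each weight is stored as a short index (here 4 bits' worth of levels) instead of a 64-bit float:

```python
import numpy as np

def quantize_weights(weights, n_levels=16):
    """Weight sharing via 1-D k-means-style quantization: replace each
    float weight with an index into a small shared codebook of values."""
    weights = np.asarray(weights, dtype=np.float64)
    # Seed the codebook with evenly spaced quantiles of the weights.
    codebook = np.quantile(weights, np.linspace(0.0, 1.0, n_levels))
    for _ in range(20):  # a few Lloyd iterations to refine the codebook
        # Assign every weight to its nearest codebook entry.
        idx = np.abs(weights[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each codebook entry to the mean of its assigned weights.
        for k in range(n_levels):
            if np.any(idx == k):
                codebook[k] = weights[idx == k].mean()
    return idx.astype(np.uint8), codebook

rng = np.random.default_rng(0)
leaf_values = rng.normal(size=1000)       # stand-in for rule/leaf weights
indices, codebook = quantize_weights(leaf_values, n_levels=16)
reconstructed = codebook[indices]         # decode at prediction time

# 16 levels need only 4 bits per weight instead of 64 for float64.
print(len(codebook), indices.dtype, np.abs(leaf_values - reconstructed).max())
```

At prediction time only the uint8 index array and the tiny codebook are kept in memory, which is where the bulk of the size reduction for an ensemble's weights comes from in this style of compression.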
author |
Lundberg, Jacob |
author_facet |
Lundberg, Jacob |
author_sort |
Lundberg, Jacob |
title |
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems |
title_short |
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems |
title_full |
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems |
title_fullStr |
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems |
title_full_unstemmed |
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems |
title_sort |
resource efficient representation of machine learning models : investigating optimization options for decision trees in embedded systems |
publisher |
Linköpings universitet, Statistik och maskininlärning |
publishDate |
2019 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013 |
work_keys_str_mv |
AT lundbergjacob resourceefficientrepresentationofmachinelearningmodelsinvestigatingoptimizationoptionsfordecisiontreesinembeddedsystems AT lundbergjacob resurseffektivrepresentationavmaskininlarningsmodeller |
_version_ |
1719293441522794496 |