Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems

Combining embedded systems and machine learning models is an exciting prospect. However, to target even embedded systems with the most stringent resource requirements, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark model is created with LightGBM, a popular framework for gradient-boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework, and then further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework called ESRule. The data come from the domain of frequency measurements in cellular networks, where there is a clear use case for embedded systems running the resulting resource-optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average while simultaneously increasing predictive performance; the models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard-disk space than LightGBM, and ESRule is also clearly faster at predicting a single sample.
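The RuleFit step described in this record (decision-tree paths become binary rule features, and a LASSO fit prunes most of them, leaving a small rule set an embedded device can evaluate cheaply) can be sketched roughly as follows. This is an illustrative, stdlib-only approximation, not the thesis's ESRule code; the rules, data, and function names are all hypothetical.

```python
# Hypothetical sketch of the RuleFit idea: binary rule features + LASSO.

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; the core of the LASSO update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(X, y, lam=0.05, iters=200):
    """Coordinate-descent LASSO for min 0.5*||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # Partial residual: what is left of y once all other features act.
            r = [y[i] - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            num = sum(X[i][j] * r[i] for i in range(n))
            den = sum(X[i][j] ** 2 for i in range(n)) or 1.0
            beta[j] = soft_threshold(num / den, lam / den)
    return beta

# Rules extracted from (hypothetical) tree paths, applied to samples x.
rules = [
    lambda x: x[0] > 2.0,                  # a genuinely predictive rule
    lambda x: x[0] > 2.0 and x[1] < 1.0,   # a deeper-path rule
    lambda x: x[1] < 5.0,                  # always true here; LASSO can drop it
]
data = [(0.0, 0.5), (3.0, 0.5), (3.0, 2.0), (1.0, 4.0)]
y = [0.0, 2.0, 1.0, 0.0]

# Each sample becomes a binary vector of rule activations.
X = [[1.0 if r(x) else 0.0 for r in rules] for x in data]
beta = lasso(X, y)
# The useless always-true rule receives (near-)zero weight and can be pruned,
# so the embedded model only needs to evaluate the surviving rules.
```

The appeal for constrained hardware is that each surviving rule is just a few threshold comparisons plus one weight, rather than a full tree traversal.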

Bibliographic Details
Main Author: Lundberg, Jacob
Format: Others
Language: English
Published: Linköpings universitet, Statistik och maskininlärning 2019
Subjects:
Online Access:http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013
id ndltd-UPSALLA1-oai-DiVA.org-liu-162013
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-1620132019-11-20T22:04:27ZResource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systemsengResurseffektiv Representation av MaskininlärningsmodellerLundberg, JacobLinköpings universitet, Statistik och maskininlärning2019machine learningrule fitdecision treesembedded systemsresourcesensemble methodslassoregressionoptimizationComputer EngineeringDatorteknikCombining embedded systems and machine learning models is an exciting prospect. However, to fully target any embedded system, with the most stringent resource requirements, the models have to be designed with care not to overwhelm it. Decision tree ensembles are targeted in this thesis. A benchmark model is created with LightGBM, a popular framework for gradient boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework. Then it is further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework, called ESRule. The data used comes from the domain of frequency measurements in cellular networks. There is a clear use-case where embedded systems can use the produced resource optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average, simultaneously increasing predictive performance. The models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard disk space than LightGBM. ESRule is also clearly faster at predicting a single sample. Student thesisinfo:eu-repo/semantics/bachelorThesistexthttp://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013application/pdfinfo:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic machine learning
rule fit
decision trees
embedded systems
resources
ensemble methods
lasso
regression
optimization
Computer Engineering
Datorteknik
spellingShingle machine learning
rule fit
decision trees
embedded systems
resources
ensemble methods
lasso
regression
optimization
Computer Engineering
Datorteknik
Lundberg, Jacob
Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
description Combining embedded systems and machine learning models is an exciting prospect. However, to target even embedded systems with the most stringent resource requirements, the models must be designed carefully so as not to overwhelm them. This thesis targets decision tree ensembles. A benchmark model is created with LightGBM, a popular framework for gradient-boosted decision trees. This model is first transformed and regularized with RuleFit, a LASSO regression framework, and then further optimized with quantization and weight sharing, techniques used when compressing neural networks. The entire process is combined into a novel framework called ESRule. The data come from the domain of frequency measurements in cellular networks, where there is a clear use case for embedded systems running the resulting resource-optimized models. Compared with LightGBM, ESRule uses 72× less internal memory on average while simultaneously increasing predictive performance; the models use 4 kilobytes on average. The serialized variant of ESRule uses 104× less hard-disk space than LightGBM, and ESRule is also clearly faster at predicting a single sample.
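The quantization and weight-sharing step the description borrows from neural-network compression can likewise be sketched: real-valued rule weights are clustered into a small codebook of shared values, so the model stores one short index per rule plus the tiny codebook instead of one full-precision float per rule. This is a hypothetical, stdlib-only illustration (simple 1-D k-means), not ESRule's actual implementation; all names are invented.

```python
# Illustrative weight sharing: replace each weight with an index into a
# small codebook of shared values, as in neural-network compression.

def share_weights(weights, k=4, iters=25):
    """Cluster 1-D weights into k shared values via simple k-means."""
    lo, hi = min(weights), max(weights)
    # Initialise centroids evenly across the weight range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        assign = [min(range(k), key=lambda j: abs(w - centroids[j]))
                  for w in weights]
        # Recompute each centroid as the mean of its cluster.
        for j in range(k):
            members = [w for w, a in zip(weights, assign) if a == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, assign

weights = [0.11, 0.12, 0.95, 1.02, -0.48, -0.51, 0.13, 0.98]
codebook, idx = share_weights(weights, k=3)

# Storage drops from 8 floats to 8 two-bit indices plus 3 shared floats;
# prediction reconstructs an approximate weight by a codebook lookup.
approx = [codebook[i] for i in idx]
```

With, say, a 4-entry codebook each weight index fits in 2 bits, which is the kind of saving that, combined with rule pruning, plausibly underlies the large memory reductions the description reports.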
author Lundberg, Jacob
author_facet Lundberg, Jacob
author_sort Lundberg, Jacob
title Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_short Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_full Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_fullStr Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_full_unstemmed Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems
title_sort resource efficient representation of machine learning models : investigating optimization options for decision trees in embedded systems
publisher Linköpings universitet, Statistik och maskininlärning
publishDate 2019
url http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013
work_keys_str_mv AT lundbergjacob resourceefficientrepresentationofmachinelearningmodelsinvestigatingoptimizationoptionsfordecisiontreesinembeddedsystems
AT lundbergjacob resurseffektivrepresentationavmaskininlarningsmodeller
_version_ 1719293441522794496