Layer-wise Fixed Point Quantization for Deep Convolutional Neural Networks and Implementation of YOLOv3 Inference Engine

Master's thesis === National Cheng Kung University === Institute of Computer and Communication Engineering === 107 === With the growing popularity of mobile devices and the effectiveness of deep-learning algorithms, there is strong interest in deploying deep learning models on mobile devices. Deployment, however, is limited by computational complexity and software overhead. We propose an efficient inference framework for resource-limited devices, roughly 1000 times smaller than TensorFlow in code size, together with a layer-wise quantization scheme that allows inference to be computed entirely in fixed-point arithmetic. The fixed-point scheme is more efficient than floating-point arithmetic: in a coarse-grained cost evaluation it reduces power consumption to 8% of the original, shrinks model size to 25%–40% of the original, and keeps the Top-5 accuracy loss under 1% for AlexNet on ImageNet.
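
The abstract's layer-wise fixed-point idea can be illustrated with a minimal sketch (my own illustrative code under assumed details, not the thesis's implementation): each layer independently chooses how many fractional bits its fixed-point words carry, based on that layer's largest weight magnitude, so small-valued layers keep more precision than large-valued ones.

```python
import numpy as np

def quantize_layer(weights, total_bits=8):
    """Quantize one layer's weights to signed fixed point.

    Per-layer choice of fractional bits: reserve enough integer
    bits to cover the layer's largest-magnitude weight, spend the
    rest on the fraction. Hypothetical sketch, not the thesis's
    exact scheme.
    """
    max_abs = np.max(np.abs(weights))
    # Integer bits needed so max_abs fits (sign bit excluded).
    int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))))
    frac_bits = total_bits - 1 - int_bits  # 1 bit reserved for sign
    scale = 2.0 ** frac_bits
    q = np.clip(np.round(weights * scale),
                -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1)
    return q.astype(np.int32), frac_bits

def dequantize(q, frac_bits):
    """Map fixed-point integers back to real values."""
    return q.astype(np.float64) / (2.0 ** frac_bits)

# Layers with different weight scales get different fractional widths.
rng = np.random.default_rng(0)
for spread in (0.05, 0.5, 4.0):
    w = rng.normal(0, spread, size=100)
    q, f = quantize_layer(w)
    err = np.max(np.abs(dequantize(q, f) - w))
    print(f"spread={spread}: frac_bits={f}, max_abs_err={err:.4f}")
```

The layer-wise format matters because a single global scale would waste range on layers whose weights are small; here each layer's quantization step is matched to its own dynamic range.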

Bibliographic Details
Main Author: Wei-Chung Tseng (曾微中)
Other Authors: Chung-Ho Chen
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/x46nq6
id ndltd-TW-107NCKU5652009
record_format oai_dc
title (Chinese) 深度卷積網路之逐層定點數量化方法與實作YOLOv3推論引擎
advisor Chung-Ho Chen 陳中和
thesis 學位論文 ; thesis, 70 pages, 2019
collection NDLTD
sources NDLTD