Layer-wise Fixed Point Quantization for Deep Convolutional Neural Networks and Implementation of YOLOv3 Inference Engine
Main Authors:
Other Authors:
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/x46nq6
Summary: Master's thesis === National Cheng Kung University (國立成功大學) === Institute of Computer and Communication Engineering (電腦與通信工程研究所) === Academic Year 107 === With the increasing popularity of mobile devices and the effectiveness of deep learning-based algorithms, people are trying to deploy deep learning models on mobile devices. However, this is limited by computational complexity and software overhead.
We propose an efficient inference framework for resource-limited devices, about 1000 times smaller than TensorFlow in code size, together with a layer-wise quantization scheme that allows inference to be computed entirely with fixed-point arithmetic. The fixed-point scheme is more efficient than floating-point arithmetic: in a coarse-grained cost evaluation, power consumption is reduced to 8% of the floating-point baseline, model size is reduced to 25%~40% of the original, and the Top-5 accuracy loss of AlexNet on ImageNet is kept under 1%.
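To make the layer-wise fixed-point idea concrete, here is a minimal sketch, not the thesis's actual implementation: it assumes a symmetric signed fixed-point format per layer and picks each layer's fractional bit-width from that layer's largest absolute weight. All names (`choose_fraction_bits`, `quantize_layer`, `dequantize`) are hypothetical.

```python
# Sketch of layer-wise fixed-point quantization (hypothetical helpers).
import numpy as np

def choose_fraction_bits(weights: np.ndarray, total_bits: int = 8) -> int:
    """Pick a per-layer fractional bit-width so the largest absolute
    weight still fits in a signed fixed-point word of `total_bits`."""
    max_abs = float(np.max(np.abs(weights)))
    if max_abs == 0.0:
        return total_bits - 1
    # Integer bits needed for max_abs; negative for layers whose weights
    # are all well below 1, which buys those layers extra precision.
    int_bits = int(np.floor(np.log2(max_abs))) + 1
    return total_bits - 1 - int_bits

def quantize_layer(weights: np.ndarray, total_bits: int = 8):
    """Quantize one layer's weights to fixed point; returns the integer
    codes plus the layer's scale factor (2 ** -frac_bits)."""
    frac_bits = choose_fraction_bits(weights, total_bits)
    scale = 2.0 ** -frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    codes = np.clip(np.round(weights / scale), qmin, qmax).astype(np.int32)
    return codes, scale

def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Map fixed-point codes back to floating point for comparison."""
    return codes.astype(np.float32) * scale

# Each layer gets its own format: a layer with small weights keeps more
# fractional bits than a single network-wide format would allow.
layer_weights = (np.random.randn(64, 3, 3, 3) * 0.1).astype(np.float32)
codes, scale = quantize_layer(layer_weights, total_bits=8)
error = np.max(np.abs(layer_weights - dequantize(codes, scale)))
print(f"scale={scale}, max quantization error={error:.6f}")
```

Choosing the format per layer rather than globally is what keeps the accuracy loss small: convolutional layers typically have very different weight ranges, so a single network-wide fixed-point format would waste precision on some layers and clip others.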