Signed-Power-of-Two (SPT) Neuron Design and Synapse Weight Approximation for Embedded Neural Networks


Bibliographic Details
Main Author: LIN, CHI-YOU (林祺祐)
Other Authors: YEH, CHING-WEI
Format: Others
Language: zh-TW
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/d48y57
Summary: Master's thesis === National Chung Cheng University === Graduate Institute of Electrical Engineering === 106 === Lowering computational complexity is essential if deep neural networks (DNNs) are to be integrated into cost- and power-sensitive embedded systems. This thesis first proposes a design-time normalization technique that converts a floating-point DNN model into a fixed-point one; the concept is derived from the shared exponents of "block floating-point," so the tedious trade-offs between integer and fractional bits in traditional floating-point-to-fixed-point conversion are eliminated. In addition, fine-grained per-neuron normalization is proposed to save a further 1 to 2 bits of effective wordlength. The thesis then proposes approximating synapse weights with a limited number of signed-power-of-two (SPT) terms, so that the extensive multiplications in a conventional DNN are replaced by shifts and additions. In our 28nm CMOS implementations, the maximum performance and the silicon area of a neuron improve by 36% and 18% simultaneously, and the silicon area alone can be improved by roughly 39% at identical performance. Finally, the vector quantization technique from image compression is applied to select a small set of outstanding weights, and every synapse weight is replaced by its nearest outstanding weight. In our simulations on the MNIST dataset, as few as 4 to 16 outstanding weights achieve accuracy similar to the 16-bit full-set weights; in other words, weight storage is effectively reduced by 4X to 8X.
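
To make the shared-exponent normalization concrete, the following Python sketch (illustrative only; `quantize_shared_exponent`, the 8-bit default, and the rounding policy are assumptions, not the thesis's exact procedure) quantizes a block of weights with one power-of-two scale in the style of block floating-point:

```python
import numpy as np

def quantize_shared_exponent(weights, wordlength=8):
    """Quantize a block of weights to fixed point with one shared
    power-of-two scale (block-floating-point style sketch)."""
    max_mag = np.max(np.abs(weights))
    # Smallest power of two that covers the largest magnitude in the block.
    exp = int(np.ceil(np.log2(max_mag))) if max_mag > 0 else 0
    scale = 2.0 ** (exp - (wordlength - 1))   # value of one LSB
    # Round to signed wordlength-bit integers; the single largest
    # magnitude may saturate by one LSB.
    q = np.clip(np.round(weights / scale),
                -(2 ** (wordlength - 1)), 2 ** (wordlength - 1) - 1)
    return q.astype(np.int32), scale
```

Applied once per layer, this gives the coarse design-time normalization; applied once per neuron's weight vector, the shared exponent tightens to that neuron's own dynamic range, which is the intuition behind the reported 1-to-2-bit wordlength saving.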
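
The SPT approximation can be sketched the same way. The greedy term selection below and the `num_terms`/`exp_range` parameters are assumptions for illustration; the point is that once a weight is expressed as a few signed powers of two, multiplying by it needs only shifts and additions:

```python
import math

def spt_approximate(w, num_terms=2, exp_range=(-8, 0)):
    """Greedily approximate a scalar weight as a sum of at most
    `num_terms` signed powers of two with exponents in `exp_range`."""
    residual, terms = w, []
    for _ in range(num_terms):
        if residual == 0:
            break
        sign = 1 if residual > 0 else -1
        # Nearest power of two to the remaining residual (in log domain).
        e = int(round(math.log2(abs(residual))))
        e = max(exp_range[0], min(exp_range[1], e))
        terms.append((sign, e))
        residual -= sign * 2.0 ** e
    return terms

def spt_multiply(x, terms):
    """Multiply x by the approximated weight using only scaling by
    powers of two (shifts, for integer x) and additions."""
    return sum(sign * x * 2.0 ** e for sign, e in terms)

# Example: spt_approximate(0.59375, num_terms=3) -> [(1, -1), (1, -3), (-1, -5)],
# i.e. 0.59375 = 2^-1 + 2^-3 - 2^-5, so w*x costs three shift-adds.
```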
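
For the final contribution, a plain 1-D k-means pass is one common way to build such a vector-quantization codebook; the function below is a hypothetical sketch, and the thesis's actual codeword selection may differ:

```python
import numpy as np

def build_weight_codebook(weights, codebook_size=16, iters=50):
    """1-D k-means sketch: choose `codebook_size` outstanding weights
    and map every synapse weight to its nearest codeword."""
    w = weights.ravel()
    # Initialize codewords at evenly spaced percentiles of the weights.
    codebook = np.percentile(w, np.linspace(0, 100, codebook_size))
    for _ in range(iters):
        idx = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        for k in range(codebook_size):
            if np.any(idx == k):          # avoid emptying a cluster
                codebook[k] = w[idx == k].mean()
    idx = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
    return codebook, idx.reshape(weights.shape)
```

With 16 codewords each stored weight becomes a 4-bit index, and with 4 codewords a 2-bit index, versus the 16-bit full-set weights, which is consistent with the 4X-to-8X storage reduction quoted above.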