Hardware-Based Real-Time Deep Neural Network Lossless Weights Compression
Deep Neural Networks (DNN) are widely applied to many mobile applications demanding real-time implementation and large memory space. Therefore, it presents a new challenge for low-power and efficient implementation of a diversity of applications, such as speech recognition and image classification,...
Main Authors: | Tomer Malach, Shlomo Greenberg, Moshe Haiut |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | Deep neural network; entropy compression; hardware decoder; real-time |
Online Access: | https://ieeexplore.ieee.org/document/9253521/ |
id | doaj-89736c4565be4bff8505636d8f57abea |
---|---|
record_format | Article |
spelling | doaj-89736c4565be4bff8505636d8f57abea; 2021-03-30T04:11:52Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 205051; 205060; 10.1109/ACCESS.2020.3037254; 9253521; Hardware-Based Real-Time Deep Neural Network Lossless Weights Compression; Tomer Malach (0), https://orcid.org/0000-0002-6045-3189; Shlomo Greenberg (1), https://orcid.org/0000-0002-1385-8394; Moshe Haiut (2), https://orcid.org/0000-0002-7028-9888; School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Be’er Sheva, Israel; School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Be’er Sheva, Israel; DSP Group Inc., Herzliya, Israel; Deep Neural Networks (DNN) are widely applied to many mobile applications demanding real-time implementation and large memory space. Therefore, it presents a new challenge for low-power and efficient implementation of a diversity of applications, such as speech recognition and image classification, for embedded edge devices. This work presents a hardware-based DNN compression approach to address the limited memory resources in edge devices. We propose a new entropy-based compression algorithm for encoding DNN weights, as well as a real-time decoding method and efficient dedicated hardware implementation. The proposed approach enables a significant reduction of the required DNN weights memory (approximately 70% and 63% for AlexNet and VGG19, respectively), while allowing the decoding of one weight per clock cycle. Results show a high compression ratio compared to well-known lossless compression algorithms. The proposed hardware decoder enables an efficient implementation of large DNN networks in low-power edge devices with limited memory resources.; https://ieeexplore.ieee.org/document/9253521/; Deep neural network; entropy compression; hardware decoder; real-time |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Tomer Malach, Shlomo Greenberg, Moshe Haiut |
title | Hardware-Based Real-Time Deep Neural Network Lossless Weights Compression |
publisher | IEEE |
series | IEEE Access |
issn | 2169-3536 |
publishDate | 2020-01-01 |
description | Deep Neural Networks (DNN) are widely applied to many mobile applications demanding real-time implementation and large memory space. Therefore, it presents a new challenge for low-power and efficient implementation of a diversity of applications, such as speech recognition and image classification, for embedded edge devices. This work presents a hardware-based DNN compression approach to address the limited memory resources in edge devices. We propose a new entropy-based compression algorithm for encoding DNN weights, as well as a real-time decoding method and efficient dedicated hardware implementation. The proposed approach enables a significant reduction of the required DNN weights memory (approximately 70% and 63% for AlexNet and VGG19, respectively), while allowing the decoding of one weight per clock cycle. Results show a high compression ratio compared to well-known lossless compression algorithms. The proposed hardware decoder enables an efficient implementation of large DNN networks in low-power edge devices with limited memory resources. |
topic | Deep neural network; entropy compression; hardware decoder; real-time |
url | https://ieeexplore.ieee.org/document/9253521/ |