Summary: | Hardware-friendly architectures with low computational overhead are desirable for low-latency realization of CNNs on resource-constrained embedded platforms. In this work, we propose CAxCNN, a Canonic Sign Digit (CSD) based approximation methodology for representing the filter weights of pre-trained CNNs. The proposed CSD representation allows the use of multipliers with reduced computational complexity. The technique is complementary to state-of-the-art CNN quantization schemes and can be applied on top of them. Our experimental results on a variety of CNNs trained on the MNIST, CIFAR-10 and ImageNet datasets demonstrate that our methodology provides CNN designs with multiple levels of classification accuracy, without requiring any retraining, while keeping area and computational overhead low. Furthermore, when applied in conjunction with a state-of-the-art quantization scheme, CAxCNN allows the use of multipliers that offer a 77% reduction in logic area compared to their accurate counterparts, while incurring a drop in Top-1 accuracy of just 5.63% for a VGG-16 network trained on ImageNet.
|
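To make the idea of a CSD-based weight approximation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the weight has already been quantized to an integer, converts it to CSD form (digits in {-1, 0, +1} with no two adjacent nonzero digits), and then approximates it by keeping only the most significant nonzero digits. The function names `to_csd`, `from_csd`, and `truncate_csd`, and the truncation rule itself, are illustrative assumptions; the summary does not specify how CAxCNN selects which digits to keep.

```python
from typing import List


def to_csd(n: int) -> List[int]:
    """Return the CSD digits of integer n, least significant digit first.

    Each digit is -1, 0, or +1, and no two adjacent digits are nonzero.
    """
    digits = []
    while n != 0:
        if n % 2:
            # +1 if n is 1 mod 4, -1 if n is 3 mod 4; this forces the next digit to 0.
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def from_csd(digits: List[int]) -> int:
    """Reconstruct the integer value from CSD digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def truncate_csd(digits: List[int], max_nonzero: int) -> List[int]:
    """Keep only the max_nonzero most significant nonzero digits (assumed rule)."""
    out = [0] * len(digits)
    kept = 0
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] != 0 and kept < max_nonzero:
            out[i] = digits[i]
            kept += 1
    return out


if __name__ == "__main__":
    w = 23  # hypothetical quantized weight
    csd = to_csd(w)
    approx = from_csd(truncate_csd(csd, 2))
    print(csd, approx)
```

Each nonzero CSD digit corresponds to one shifted add or subtract in a shift-and-add multiplier, so limiting the number of nonzero digits is one way to trade accuracy for multiplier complexity.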