Summary: | Master's === National Taiwan University === Graduate Institute of Electronics Engineering === 105 === In the past few years, various methods have been proposed to solve image recognition problems. During training, the system must correctly learn the features of every image. Recently, Convolutional Neural Networks (CNNs) have emerged as models with powerful discriminative capability, especially in image recognition and object detection. CNN-based methods have achieved great success in numerous applications and are widely used in computer vision. However, their massive computation requirements, resource consumption, and memory accesses make them hard to deploy on mobile or embedded systems. The bandwidth required to move the large number of parameters involved in the computation is a major concern for architecture design. To design a system with a high frame rate, it is important to make the utilization of data as high as possible.
In this work, we design an architecture that accelerates the convolutional and max-pooling layers of the network for ImageNet large-scale image classification. We first present a computational timing analysis of CNN models and show that convolutional layers are the most time-consuming. We then propose data quantization using stochastic probability and threshold probability methods, and our experiments show that these methods are efficient for convolutional layers, reducing bandwidth and improving resource utilization. Finally, a state-of-the-art CNN, VGG-16, is implemented in hardware, with architectural design choices aimed at a high frame rate. The system achieves 13.3 fps with 15-bit data quantization at an operating frequency of 200 MHz, a higher frame rate than previous works. The system bandwidth is 1.44 GB/s, which reflects better utilization of data than previous approaches.
|
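The abstract names stochastic and threshold methods for data quantization but does not define them. As a rough illustration only, the sketch below assumes the usual meanings: threshold rounding rounds up when the discarded fraction reaches a fixed cutoff, and stochastic rounding rounds up with probability equal to that fraction. The function name `to_fixed`, the 8-bit fractional split of the 15-bit word, and the saturation behavior are all hypothetical choices, not details taken from the thesis.

```python
import math
import random


def to_fixed(x, total_bits=15, frac_bits=8, mode="threshold",
             threshold=0.5, rng=None):
    """Quantize x to a signed fixed-point value and return it as a float.

    Hypothetical helper: the bit split and rounding rules are assumptions,
    not the thesis's actual scheme.
    """
    scale = 1 << frac_bits          # number of steps per unit
    scaled = x * scale
    low = math.floor(scaled)        # nearest grid point at or below x
    frac = scaled - low             # discarded fraction in [0, 1)

    if mode == "stochastic":
        # Round up with probability equal to the discarded fraction,
        # so the quantized value is unbiased in expectation.
        rng = rng or random
        round_up = rng.random() < frac
    else:
        # Round up when the discarded fraction reaches a fixed cutoff.
        round_up = frac >= threshold

    q = low + (1 if round_up else 0)

    # Saturate to the signed range representable in total_bits.
    q_max = (1 << (total_bits - 1)) - 1
    q_min = -(1 << (total_bits - 1))
    q = max(q_min, min(q_max, q))
    return q / scale
```

With a coarse 2-bit fraction, `to_fixed(0.3, frac_bits=2)` always returns 0.25, while `to_fixed(0.3, frac_bits=2, mode="stochastic")` returns either 0.25 or 0.5, with an average close to 0.3 over many draws. That unbiasedness is the usual reason for preferring stochastic rounding at low bit widths.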