Summary: Master's === National Taiwan University === Graduate Institute of Electronics Engineering === 106 === Artificial intelligence (AI) has become one of the most popular research topics in recent years. AI can be applied to image classification, object detection, and natural language processing; in particular, researchers have achieved breakthroughs in these fields with neural networks. Neural networks are known for their versatile and deep architectures, which can have hundreds of layers. Such structures make neural networks require large amounts of computation and memory.
Improvements in hardware acceleration on graphics processing units (GPUs) have made it possible to apply neural networks to practical applications. However, GPUs tend to be bulky and power hungry. Many studies have therefore focused on reducing the computation used by neural networks and on implementations on dedicated hardware. Most of these works only support acceleration of the inference phase.
Beyond inference, this thesis proposes an architecture that also supports the training phase, which is based on the backpropagation algorithm for finding optimal neural network models. The training phase consists of a forward pass, a backward pass, and a weight update, while inference contains only the forward pass. This thesis is devoted to designing a unified architecture that can process all three training stages of convolutional neural networks (CNNs).
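To make the three stages concrete, here is a minimal NumPy sketch of one training step on a single fully connected layer (a toy example for brevity, not the thesis's CNN hardware data flow; all variable names are illustrative):

    import numpy as np

    # Toy layer: y = W @ x with squared-error loss L = 0.5 * ||y - t||^2.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 8)) * 0.1   # weights
    x = rng.standard_normal(8)              # input activation
    t = rng.standard_normal(4)              # training target
    lr = 0.01                               # learning rate

    # Stage 1: forward pass (the only stage needed during inference).
    y = W @ x

    # Stage 2: backward pass -- propagate the loss gradient toward the input.
    dy = y - t        # dL/dy for the squared-error loss
    dx = W.T @ dy     # dL/dx, handed to the preceding layer

    # Stage 3: weight update (plain stochastic gradient descent).
    dW = np.outer(dy, x)   # dL/dW
    W -= lr * dW

Inference runs only stage 1; a training accelerator must additionally stream gradients backward (stage 2) and read-modify-write the weights (stage 3), which is why a unified architecture covering all three stages is needed.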
In addition, I/O bandwidth is usually the bottleneck in accelerator design. To reduce the data bandwidth, this thesis builds on the floating-point signed digit (FloatSD) algorithm and the quantization techniques of previous work to shrink the neural network size and the bit width of data values. The previous work reached only a 0.8% loss in top-5 accuracy on the ImageNet dataset compared to the floating-point version.
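The exact FloatSD encoding is defined in that previous work; purely as an illustration of the general signed-digit idea, the sketch below greedily approximates a weight by a short sum of signed powers of two, which lets a multiplication be replaced by a few shifts and adds (the function name and greedy scheme here are illustrative assumptions, not the thesis's algorithm):

    import numpy as np

    def signed_pow2_quantize(w, n_terms=2):
        # Greedily approximate w as a sum of n_terms signed powers of two.
        approx, terms = 0.0, []
        for _ in range(n_terms):
            r = w - approx
            if r == 0.0:
                break
            e = int(np.floor(np.log2(abs(r))))  # largest power of two <= |r|
            term = np.sign(r) * 2.0 ** e
            terms.append(term)
            approx += term
        return approx, terms

    q, terms = signed_pow2_quantize(0.8125)
    print(q, terms)   # 0.75 [0.5, 0.25]: x * 0.8125 ~ (x >> 1) + (x >> 2)

With weights constrained to such forms, each multiply-accumulate degenerates into shift-and-add logic, which is far cheaper in hardware and needs fewer bits of storage.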
This thesis designs a hardware accelerator for training neural networks, covering the processing data flow, the AMBA interface, and the memory configuration. The design is an IP-level engine that can be integrated into an SoC platform. In addition, this thesis focuses on optimizing data reuse so that the system accesses DRAM efficiently.
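As a generic illustration of the kind of data reuse involved (the thesis's actual tile sizes and loop order are part of its design and are not shown here), the sketch below tiles the output of a direct convolution so that each input patch is fetched once per tile instead of once per output pixel:

    import numpy as np

    def conv_tiled(ifmap, weights, tile=8):
        # Direct convolution (stride 1, no padding) computed tile by tile.
        # ifmap: (C, H, W), weights: (M, C, K, K) -> out: (M, OH, OW)
        C, H, W = ifmap.shape
        M, _, K, _ = weights.shape
        OH, OW = H - K + 1, W - K + 1
        out = np.zeros((M, OH, OW))
        for oy in range(0, OH, tile):
            for ox in range(0, OW, tile):
                th, tw = min(tile, OH - oy), min(tile, OW - ox)
                # Fetch the input patch this tile needs once; in hardware it
                # would stay in on-chip SRAM while the tile is computed.
                patch = ifmap[:, oy:oy + th + K - 1, ox:ox + tw + K - 1]
                for m in range(M):
                    for ky in range(K):
                        for kx in range(K):
                            out[m, oy:oy + th, ox:ox + tw] += (
                                weights[m, :, ky, kx][:, None, None]
                                * patch[:, ky:ky + th, kx:kx + tw]
                            ).sum(axis=0)
        return out

Because neighbouring output pixels share most of their input window, keeping a tile's worth of data on chip converts many DRAM reads into SRAM reads, which is the kind of saving the data-reuse optimization targets.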
Keywords: Convolutional neural network, Backpropagation, FloatSD