Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design
Master's === National Taiwan University === Graduate Institute of Electronics Engineering === 106 === Artificial intelligence (AI) has become one of the most popular research topics in recent years. AI is applied to image classification, object detection, and natural language processing, and researchers have achieved breakthroughs in these fields with...
Main Authors: | Tzung-Han Juang 莊宗翰 |
---|---|
Other Authors: | 闕志達 |
Format: | Others |
Language: | en_US |
Published: | 2018 |
Online Access: | http://ndltd.ncl.edu.tw/handle/sffx7b |
id | ndltd-TW-106NTU05428046 |
---|---|
record_format | oai_dc |
spelling | ndltd-TW-106NTU054280462019-07-25T04:46:48Z http://ndltd.ncl.edu.tw/handle/sffx7b Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design 高效能神經網路訓練加速器架構與其電路設計 Tzung-Han Juang 莊宗翰 碩士 國立臺灣大學 電子工程學研究所 106 闕志達 2018 學位論文 ; thesis 127 en_US |
collection | NDLTD |
language | en_US |
format | Others |
sources | NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Electronics Engineering === 106 === Artificial intelligence (AI) has become one of the most popular research topics in recent years. AI is applied to image classification, object detection, and natural language processing, and researchers have achieved breakthroughs in these fields with neural networks. Neural networks are known for their versatile and deep architectures, which can contain hundreds of layers. Such structures make neural networks demand large amounts of computation and memory.
Improvements in hardware acceleration on graphics processing units (GPUs) have made it possible to apply neural networks to practical applications. However, GPUs tend to be bulky and power hungry. Much research has therefore focused on reducing the computational resources used by neural networks and on implementing them on dedicated hardware. Most of these works only accelerate the inference phase.
Beyond inference, this thesis proposes an architecture that also supports the training phase, which is based on the backpropagation algorithm for finding optimal neural network models. The training phase includes the forward pass, the backward pass, and the weight update, while inference contains only the forward pass. This thesis is devoted to designing a unified architecture that can process these three stages of training for convolutional neural networks (CNNs).
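To make the three training stages concrete, the following is a minimal NumPy sketch of the forward pass, backward pass, and weight update for a single fully-connected layer with a squared-error loss; the layer sizes, loss, and learning rate are illustrative assumptions and do not represent the thesis's hardware data flow, which targets convolutional layers.

```python
import numpy as np

# Illustrative sizes, loss, and learning rate; not the thesis's data flow.
rng = np.random.default_rng(0)
x = rng.standard_normal(16)        # input activation from the previous layer
W = rng.standard_normal((8, 16))   # layer weights
t = rng.standard_normal(8)         # training target
lr = 0.01                          # learning rate

# 1) Forward pass: compute the layer output (inference stops here).
y = W @ x

# 2) Backward pass: propagate the loss gradient toward the previous layer.
dy = y - t                         # dL/dy for L = 0.5 * ||y - t||^2
dx = W.T @ dy                      # gradient handed to the previous layer

# 3) Weight update: apply the weight gradient with a plain SGD step.
dW = np.outer(dy, x)               # dL/dW
W -= lr * dW
```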
In addition, I/O bandwidth is often the bottleneck in accelerator design. To reduce the data bandwidth, this thesis builds on the floating-point signed-digit (FloatSD) algorithm and the quantization techniques of previous work to reduce the neural network size and the bit width of data values. That previous work incurs only a 0.8% loss in top-5 accuracy on the ImageNet dataset compared with the floating-point version.
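The exact FloatSD format is defined in the cited previous work and is not reproduced in this abstract; as a rough, assumed illustration of the general signed-digit idea (approximating a weight by a few signed powers of two so that multiplications reduce to shifts and adds), one could write:

```python
import numpy as np

def signed_digit_quantize(w, n_digits=2, min_exp=-6):
    """Greedily approximate w by a sum of n_digits signed powers of two.

    This is only a generic illustration of signed-digit quantization,
    not the FloatSD format from the cited previous work.
    """
    residual = float(w)
    terms = []
    for _ in range(n_digits):
        if residual == 0.0:
            break
        sign = 1.0 if residual > 0 else -1.0
        exp = int(np.clip(np.round(np.log2(abs(residual))), min_exp, 0))
        terms.append(sign * 2.0 ** exp)
        residual -= terms[-1]
    return sum(terms), terms

# Example: 0.30 is approximated as 0.25 + 0.0625 = 0.3125.
print(signed_digit_quantize(0.30))
```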
This thesis designs a hardware accelerator for training neural networks, including the processing data flow, the AMBA interface, and the memory configuration. The design is an IP-level engine that can be integrated into an SoC platform. In addition, this thesis focuses on optimizing data reuse so that the system accesses DRAM efficiently.
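The abstract does not spell out the data-reuse scheme, so the following is only a back-of-the-envelope sketch, under assumed tile and layer sizes, of why holding an input tile in on-chip memory and reusing it across output channels reduces DRAM traffic:

```python
# All numbers below are illustrative assumptions, not figures from the thesis.
tile_h, tile_w, in_ch = 16, 16, 64   # input tile kept in on-chip SRAM
out_ch = 128                         # output channels that reuse the same tile
bytes_per_value = 2                  # e.g. 16-bit activations

tile_bytes = tile_h * tile_w * in_ch * bytes_per_value

# Without reuse: the tile is re-read from DRAM once per output channel.
dram_without_reuse = tile_bytes * out_ch
# With reuse: the tile is read from DRAM once and consumed by all channels.
dram_with_reuse = tile_bytes

print(f"DRAM reads for this tile drop {dram_without_reuse // dram_with_reuse}x")
```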
Keywords: Convolutional neural network, Backpropagation, FloatSD
|
author2 | 闕志達 |
author_facet | 闕志達 Tzung-Han Juang 莊宗翰 |
author | Tzung-Han Juang 莊宗翰 |
spellingShingle | Tzung-Han Juang 莊宗翰 Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design |
author_sort | Tzung-Han Juang |
title | Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design |
title_short | Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design |
title_full | Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design |
title_fullStr | Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design |
title_full_unstemmed | Energy-Efficient Accelerator Architecture for Neural Network Training and Its Circuit Design |
title_sort | energy-efficient accelerator architecture for neural network training and its circuit design |
publishDate | 2018 |
url | http://ndltd.ncl.edu.tw/handle/sffx7b |
work_keys_str_mv | AT tzunghanjuang energyefficientacceleratorarchitectureforneuralnetworktraininganditscircuitdesign AT zhuāngzōnghàn energyefficientacceleratorarchitectureforneuralnetworktraininganditscircuitdesign AT tzunghanjuang gāoxiàonéngshénjīngwǎnglùxùnliànjiāsùqìjiàgòuyǔqídiànlùshèjì AT zhuāngzōnghàn gāoxiàonéngshénjīngwǎnglùxùnliànjiāsùqìjiàgòuyǔqídiànlùshèjì |
_version_ | 1719230021297504256 |