Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network

Master's === National Tsing Hua University === Department of Computer Science === 107 === Object detection is an important research area in the field of computer vision. Its purpose is to find all objects in an image and recognize the class of each object. Since the development of deep learning, an increasing number of studies have applied deep lea...


Bibliographic Details
Main Authors: Shih, Kuan-Hung, 施冠宏
Other Authors: Chiu, Ching-Te
Format: Others
Language:en_US
Published: 2018
Online Access:http://ndltd.ncl.edu.tw/handle/d7d2x3
id ndltd-TW-107NTHU5392011
record_format oai_dc
spelling ndltd-TW-107NTHU53920112019-09-19T03:30:12Z http://ndltd.ncl.edu.tw/handle/d7d2x3 Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network 基於剪枝及多特徵連接輔助區域候選網路之即時物件偵測 Shih, Kuan-Hung 施冠宏 Master's National Tsing Hua University Department of Computer Science 107 Object detection is an important research area in the field of computer vision. Its purpose is to find all objects in an image and recognize the class of each object. Since the development of deep learning, an increasing number of studies have applied deep learning in object detection and have achieved successful results. For object detection, there are two types of network architectures: one-stage and two-stage. This study is based on the widely used two-stage architecture, called Faster R-CNN, and our goal is to improve the inference time to achieve real-time speed without losing accuracy. First, we use pruning to reduce the number of parameters and the amount of computation, which is expected to reduce accuracy as a result. Therefore, we propose a multi-feature assisted region proposal network composed of assisted multi-feature concatenation and a reduced region proposal network to improve accuracy. Assisted multi-feature concatenation combines feature maps from different convolutional layers as inputs for a reduced region proposal network. With our proposed method, the network can find regions of interest (ROIs) more accurately. Thus, it compensates for the loss of accuracy due to pruning. Finally, we use ZF-Net and VGG16 as backbones, and test the network on the PASCAL VOC 2007 dataset. The results show that we can compress ZF-Net from 227 MB to 45 MB and save 66% of computation. We can also compress VGG16 from 523 MB to 144 MB and save 77% of computation. Consequently, the inference speed is 40 FPS for ZF-Net and 27 FPS for VGG16. With these significant compression rates, the accuracies are 60.2% mean average precision (mAP) and 69.1% mAP for ZF-Net and VGG16, respectively.
Chiu, Ching-Te 邱瀞德 2018 degree thesis ; thesis 55 en_US
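The abstract's first step is pruning the backbone to cut parameters and computation. The thesis does not state its pruning criterion in this record, so the sketch below uses a common stand-in, magnitude-based filter pruning (rank each convolutional filter by its L1 norm and keep the strongest); all shapes and the keep ratio are illustrative assumptions.

```python
import numpy as np

# Sketch of magnitude-based filter pruning (an assumed criterion, not
# necessarily the one used in the thesis). A conv layer's weights have
# shape (out_channels, in_channels, kH, kW); filters with the smallest
# L1 norms are dropped, shrinking both model size and FLOPs.

def prune_filters(weights, keep_ratio):
    """Keep the `keep_ratio` fraction of filters with the largest L1 norm."""
    n_filters = weights.shape[0]
    n_keep = max(1, int(n_filters * keep_ratio))
    norms = np.abs(weights).reshape(n_filters, -1).sum(axis=1)  # L1 norm per filter
    keep_idx = np.sort(np.argsort(norms)[-n_keep:])             # strongest filters, in order
    return weights[keep_idx], keep_idx

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 3, 3, 3))        # hypothetical first conv of a backbone
pruned, kept = prune_filters(w, keep_ratio=0.5)
print(pruned.shape)                           # (32, 3, 3, 3)
```

In practice the next layer's input channels must be pruned to match `kept`, and the network is fine-tuned afterward to recover accuracy, which is the loss the thesis's assisted region proposal network is designed to compensate for.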
collection NDLTD
language en_US
format Others
sources NDLTD
description 碩士 === 國立清華大學 === 資訊工程學系所 === 107 === Object detection is an important research area in the field of computer vision. Its purpose is to find all objects in an image and recognize the class of each object. Since the development of deep learning, an increasing number of studies have ap- plied deep learning in object detection and have achieved successful results. For object detection, there are two types of network architectures: one-stage and two- stage. This study is based on the widely-used two-stage architecture, called Faster R-CNN, and our goal is to improve the inference time to achieve real-time speed without losing accuracy. First, we use pruning to reduce the number of parameters and the amount of computation, which is expected to reduce accuracy as a result. Therefore, we propose a multi-feature assisted region proposal network composed of assisted multi-feature concatenation and a reduced region proposal network to improve accuracy. Assisted multi-feature concatenation combines feature maps from dif- ferent convolutional layers as inputs for a reduced region proposal network. With our proposed method, the network can find regions of interest (ROIs) more accu- rately. Thus, it compensates for loss of accuracy due to pruning. Finally, we use ZF-Net and VGG16 as backbones, and test the network on the PASCAL VOC 2007 dataset. The results show that we can compress ZF-Net from 227 MB to 45 MB and save 66% of computation. We can also compress VGG16 from 523 MB to 144 MB and save 77% of computation. Consequently, the inference speed is 40 FPS for ZF-Net and 27 FPS for VGG16. With the significant compression rates, the accuracies are 60.2% mean average precision (mAP) and 69.1% mAP for ZF-Net and VGG16, respectively.
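The abstract describes assisted multi-feature concatenation: feature maps from different convolutional layers are combined and fed to a reduced region proposal network. Layers at different depths have different spatial sizes, so a minimal sketch must resize them to a common resolution before stacking along the channel axis. The layer names, shapes, and nearest-neighbor resizing below are illustrative assumptions, not the thesis's exact design.

```python
import numpy as np

# Sketch of multi-feature concatenation: resize feature maps from several
# conv layers to one spatial size, then concatenate along channels to form
# the input of a (reduced) region proposal network.

def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbor resize of a (C, H, W) feature map."""
    c, h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source col for each output col
    return fmap[:, rows[:, None], cols[None, :]]

conv3 = np.random.rand(256, 56, 56)   # shallow layer: fine spatial detail
conv4 = np.random.rand(512, 28, 28)   # middle layer
conv5 = np.random.rand(512, 14, 14)   # deep layer: strong semantics

target = (28, 28)                     # align everything to the middle layer
stacked = np.concatenate(
    [resize_nearest(f, *target) for f in (conv3, conv4, conv5)], axis=0)
print(stacked.shape)                  # (1280, 28, 28)
```

Mixing shallow and deep features this way gives the proposal network both fine localization detail and high-level semantics, which is how, per the abstract, the method recovers ROI quality lost to pruning.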
author2 Chiu, Ching-Te
author_facet Chiu, Ching-Te
Shih, Kuan-Hung
施冠宏
author Shih, Kuan-Hung
施冠宏
spellingShingle Shih, Kuan-Hung
施冠宏
Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network
author_sort Shih, Kuan-Hung
title Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network
title_short Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network
title_full Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network
title_fullStr Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network
title_full_unstemmed Real-Time Object Detection via Pruning and Concatenated Multi-Feature Assisted Region Proposal Network
title_sort real-time object detection via pruning and concatenated multi-feature assisted region proposal network
publishDate 2018
url http://ndltd.ncl.edu.tw/handle/d7d2x3
work_keys_str_mv AT shihkuanhung realtimeobjectdetectionviapruningandconcatenatedmultifeatureassistedregionproposalnetwork
AT shīguānhóng realtimeobjectdetectionviapruningandconcatenatedmultifeatureassistedregionproposalnetwork
AT shihkuanhung jīyújiǎnzhījíduōtèzhēngliánjiēfǔzhùqūyùhòuxuǎnwǎnglùzhījíshíwùjiànzhēncè
AT shīguānhóng jīyújiǎnzhījíduōtèzhēngliánjiēfǔzhùqūyùhòuxuǎnwǎnglùzhījíshíwùjiànzhēncè
_version_ 1719252485681446912