Scalable Object Detection by Filter Compression with Regularized Sparse Coding
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 103 === For practical applications, an object detection system must support a huge number of classes to meet real-world needs. Many successful object detection systems use part-based models, which train several filters (classifiers) for each class to perform multiclass object...
Main Authors: | Ting-Hsuan Chao, 趙廷軒 |
---|---|
Other Authors: | Winston Hsu |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/58394139843926118307 |
id |
ndltd-TW-103NTU05392032 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103NTU05392032 2016-11-19T04:09:45Z http://ndltd.ncl.edu.tw/handle/58394139843926118307 Scalable Object Detection by Filter Compression with Regularized Sparse Coding 大規模物件偵測利用正規化稀疏編碼 Ting-Hsuan Chao 趙廷軒 Master's National Taiwan University Graduate Institute of Computer Science and Information Engineering 103 For practical applications, an object detection system must support a huge number of classes to meet real-world needs. Many successful object detection systems use part-based models, which train several filters (classifiers) for each class to perform multiclass object detection. However, the computational complexity of these methods is linear in the number of classes, which can lead to very long computing times. To solve this problem, some works learn a codebook for the filters and operate only on the codebook, making the computational complexity sublinear in the number of classes. However, past studies failed to consider filter characteristics (e.g., filters are weights trained by a Support Vector Machine) and instead applied methods such as sparse coding, which are designed for optimizing visual signals. This mismatch results in a large accuracy loss when a large speedup is required. To remedy this shortcoming, we have developed a new method called Regularized Sparse Coding, which is designed to reconstruct filter functionality; that is, it reconstructs the ability of a filter to produce accurate scores for classification. Our method reconstructs filters by minimizing score map error, whereas sparse coding reconstructs filters by minimizing appearance error. This different optimization strategy allows our method to incur only a small accuracy loss when a large speedup is achieved. On the ILSVRC 2013 dataset, which has 200 classes, this work achieves a 16-times speedup using only 1.25% of the memory on a single CPU, with a 0.04 mAP drop compared with the original Deformable Part Model. Moreover, parallel computing on GPUs is also applicable to our work to achieve further speedup. Winston Hsu 徐宏民 2015 學位論文 ; thesis 21 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 103 === For practical applications, an object detection system must support a huge number of classes to meet real-world needs. Many successful object detection systems use part-based models, which train several filters (classifiers) for each class to perform multiclass object detection. However, the computational complexity of these methods is linear in the number of classes, which can lead to very long computing times. To solve this problem, some works learn a codebook for the filters and operate only on the codebook, making the computational complexity sublinear in the number of classes. However, past studies failed to consider filter characteristics (e.g., filters are weights trained by a Support Vector Machine) and instead applied methods such as sparse coding, which are designed for optimizing visual signals. This mismatch results in a large accuracy loss when a large speedup is required. To remedy this shortcoming, we have developed a new method called Regularized Sparse Coding, which is designed to reconstruct filter functionality; that is, it reconstructs the ability of a filter to produce accurate scores for classification. Our method reconstructs filters by minimizing score map error, whereas sparse coding reconstructs filters by minimizing appearance error. This different optimization strategy allows our method to incur only a small accuracy loss when a large speedup is achieved. On the ILSVRC 2013 dataset, which has 200 classes, this work achieves a 16-times speedup using only 1.25% of the memory on a single CPU, with a 0.04 mAP drop compared with the original Deformable Part Model. Moreover, parallel computing on GPUs is also applicable to our work to achieve further speedup.
|
author2 |
Winston Hsu |
author_facet |
Winston Hsu Ting-Hsuan Chao 趙廷軒 |
author |
Ting-Hsuan Chao 趙廷軒 |
spellingShingle |
Ting-Hsuan Chao 趙廷軒 Scalable Object Detection by Filter Compression with Regularized Sparse Coding |
author_sort |
Ting-Hsuan Chao |
title |
Scalable Object Detection by Filter Compression with Regularized Sparse Coding |
title_short |
Scalable Object Detection by Filter Compression with Regularized Sparse Coding |
title_full |
Scalable Object Detection by Filter Compression with Regularized Sparse Coding |
title_fullStr |
Scalable Object Detection by Filter Compression with Regularized Sparse Coding |
title_full_unstemmed |
Scalable Object Detection by Filter Compression with Regularized Sparse Coding |
title_sort |
scalable object detection by filter compression with regularized sparse coding |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/58394139843926118307 |
work_keys_str_mv |
AT tinghsuanchao scalableobjectdetectionbyfiltercompressionwithregularizedsparsecoding AT zhàotíngxuān scalableobjectdetectionbyfiltercompressionwithregularizedsparsecoding AT tinghsuanchao dàguīmówùjiànzhēncèlìyòngzhèngguīhuàxīshūbiānmǎ AT zhàotíngxuān dàguīmówùjiànzhēncèlìyòngzhèngguīhuàxīshūbiānmǎ |
_version_ |
1718394336259866624 |
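
Illustrative note on the abstract: the description mentions learning a codebook for the filters and operating only on the codebook. The sketch below shows why that can help, under stated assumptions: each filter is taken to be an exact sparse linear combination of codebook atoms, so by linearity the filter's score on a feature vector equals the same sparse combination of the atoms' scores. All names, sizes, and values here (`codebook`, `sparse_codes`, `feature`) are hypothetical and are not taken from the thesis; the thesis's Regularized Sparse Coding objective itself is not reproduced.

```python
# Illustrative sketch only: toy data, hypothetical names, exact sparse codes assumed.

def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def reconstruct(codebook, code):
    """Rebuild a dense filter from its sparse code {atom_index: coefficient}."""
    dense = [0.0] * len(codebook[0])
    for k, coef in code.items():
        for i, value in enumerate(codebook[k]):
            dense[i] += coef * value
    return dense

def dense_scores(filters, feature):
    """Baseline: one full dot product per filter (cost linear in the filter count)."""
    return [dot(w, feature) for w in filters]

def codebook_scores(codebook, sparse_codes, feature):
    """Codebook route: full dot products only against the shared atoms,
    then a cheap sparse weighted sum per filter."""
    atom_responses = [dot(d, feature) for d in codebook]
    return [
        sum(coef * atom_responses[k] for k, coef in code.items())
        for code in sparse_codes
    ]

# Three codebook atoms, four filters expressed as sparse combinations of them.
codebook = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.5, 0.5, 0.0]]
sparse_codes = [{0: 2.0}, {1: -1.0, 2: 4.0}, {0: 0.5, 2: 1.0}, {1: 3.0}]
feature = [1.0, 2.0, 3.0]

filters = [reconstruct(codebook, code) for code in sparse_codes]
exact = dense_scores(filters, feature)
fast = codebook_scores(codebook, sparse_codes, feature)
print(exact)
print(fast)
```

In this toy setting the two routes agree because the codes are exact. With learned, approximate codes the scores would differ, which is where the choice between minimizing appearance error and minimizing score map error, as contrasted in the abstract, would matter.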