Scalable Object Detection by Filter Compression with Regularized Sparse Coding

Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 103 === For practical applications, an object detection system requires a huge number of classes to meet real-world needs. Many successful object detection systems use part-based models, which train several filters (classifiers) for each class to perform multiclass object...


Bibliographic Details
Main Authors: Ting-Hsuan Chao, 趙廷軒
Other Authors: Winston Hsu
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/58394139843926118307
id ndltd-TW-103NTU05392032
record_format oai_dc
spelling ndltd-TW-103NTU05392032 2016-11-19T04:09:45Z http://ndltd.ncl.edu.tw/handle/58394139843926118307 Scalable Object Detection by Filter Compression with Regularized Sparse Coding 大規模物件偵測利用正規化稀疏編碼 Ting-Hsuan Chao 趙廷軒 碩士 國立臺灣大學 資訊工程學研究所 103 Winston Hsu 徐宏民 2015 學位論文 ; thesis 21 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 103 === For practical applications, an object detection system requires a huge number of classes to meet real-world needs. Many successful object detection systems use part-based models, which train several filters (classifiers) for each class to perform multiclass object detection. However, the computational complexity of these methods is linear in the number of classes, which can lead to very long computing times. To solve this problem, some works learn a codebook for the filters and conduct operations only on the codebook, making the computational complexity sublinear in the number of classes. Past studies, however, failed to consider filter characteristics (e.g., that filters are weights trained by a Support Vector Machine) and instead applied methods such as sparse coding, which is designed for optimizing visual signals. This misuse results in a large accuracy loss when a large speedup is required. To remedy this shortcoming, we have developed a new method called Regularized Sparse Coding, which is designed to reconstruct filter functionality; that is, it reconstructs the ability of a filter to produce accurate scores for classification. Our method reconstructs filters by minimizing score map error, whereas sparse coding reconstructs filters by minimizing appearance error. This different optimization strategy allows our method to incur only a small accuracy loss when a large speedup is achieved. On the ILSVRC 2013 dataset, which has 200 classes, this work achieves a 16 times speedup using only 1.25% of the memory on a single CPU, with a 0.04 mAP drop compared with the original Deformable Part Model. Moreover, parallel computing on GPUs is also applicable to our work to achieve a further speedup.
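The following is a minimal Python sketch of the two ideas in the description above, not the thesis implementation. The first part shows why a shared codebook makes scoring cheaper: each filter is a sparse combination of codebook atoms, so the atom responses are computed once and reused by every filter. The second part shows how two reconstructions with the same appearance error can have very different score errors. All names (`codebook`, `codes`, `scores_via_codebook`, `score_error`) and all numbers are hypothetical, and the exact Regularized Sparse Coding objective is not given in this record, so plain squared errors are assumed for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def reconstruct(code, codebook):
    """Rebuild a filter as a sparse linear combination of codebook atoms."""
    w = [0.0] * len(codebook[0])
    for k, alpha in code.items():
        for j, value in enumerate(codebook[k]):
            w[j] += alpha * value
    return w

def scores_via_codebook(x, codebook, codes):
    """Score all filters with one dot product per atom, shared by every filter."""
    atom_resp = [dot(d, x) for d in codebook]
    return [sum(a * atom_resp[k] for k, a in code.items()) for code in codes]

def appearance_error(w, w_hat):
    """Squared difference between the filter weights themselves."""
    return sum((a - b) ** 2 for a, b in zip(w, w_hat))

def score_error(w, w_hat, samples):
    """Squared difference between the scores the two filters produce."""
    return sum((dot(w, x) - dot(w_hat, x)) ** 2 for x in samples)

# Part 1: hypothetical codebook with 3 atoms and 2 sparsely coded filters.
codebook = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.5, 0.5, 0.0]]
codes = [{0: 2.0, 2: -1.0}, {1: 0.5}]   # atom index -> coefficient
x = [1.0, 2.0, 3.0]                     # one feature vector

shared = scores_via_codebook(x, codebook, codes)
direct = [dot(reconstruct(c, codebook), x) for c in codes]

# Part 2: two approximations of w with equal appearance error.
w = [1.0, 0.0]
w_a = [0.5, 0.0]   # error on the dimension where features are large
w_b = [1.0, 0.5]   # error on the dimension where features are small
samples = [[2.0, 0.1], [-3.0, 0.2]]

print(shared, direct)
print(appearance_error(w, w_a), appearance_error(w, w_b))
print(score_error(w, w_a, samples), score_error(w, w_b, samples))
```

In this toy setting both approximations are equally far from `w` in weight space, yet `w_b` changes the scores far less than `w_a`, which is the kind of difference a score-based objective can exploit and an appearance-based one cannot see.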
author2 Winston Hsu
author_facet Winston Hsu
Ting-Hsuan Chao
趙廷軒
author Ting-Hsuan Chao
趙廷軒
spellingShingle Ting-Hsuan Chao
趙廷軒
Scalable Object Detection by Filter Compression with Regularized Sparse Coding
author_sort Ting-Hsuan Chao
title Scalable Object Detection by Filter Compression with Regularized Sparse Coding
title_short Scalable Object Detection by Filter Compression with Regularized Sparse Coding
title_full Scalable Object Detection by Filter Compression with Regularized Sparse Coding
title_fullStr Scalable Object Detection by Filter Compression with Regularized Sparse Coding
title_full_unstemmed Scalable Object Detection by Filter Compression with Regularized Sparse Coding
title_sort scalable object detection by filter compression with regularized sparse coding
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/58394139843926118307