Part-based Pyramid Pooling Feature Fusion in Multi-Scale Supervised Network for Person Re-Identification

Bibliographic Details
Main Authors: Cing-Han Chou, 周青翰
Other Authors: Shun-Feng Su
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/m3au2d
Description
Summary: Master's thesis === National Taiwan University of Science and Technology === Department of Electrical Engineering === 107 === Person re-identification uses computer vision techniques to determine whether a particular pedestrian appears in images or video sequences. It is widely regarded as a sub-problem of image retrieval: given a pedestrian image from one camera, images of the same pedestrian captured by other cameras are retrieved. To obtain multi-scale, discriminative pedestrian features, this study proposes a Part-based Pyramid Pooling Feature Fusion in Multi-Scale Supervised Network (PFMSNet). Multi-scale features of each pedestrian body part are extracted by a pyramid pooling module and concatenated, giving the network a larger receptive field for body-part classification. However, the upsampling performed before concatenation introduces noise. Therefore, an SE (Squeeze-and-Excitation) block applies channel attention to re-weight the feature map, filtering out noise and redundant features while retaining important information. Finally, an independent classification task on the multi-scale features is added, making the network a bi-branch classification model that further supervises the multi-scale features and incorporates more semantic information. The proposed network is trained and evaluated on the Market1501 and DukeMTMC-reID datasets, where Rank-1 accuracy reaches 94.7% and 87.7%, and mAP reaches 85.1% and 75.6%, respectively.
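
Since the abstract describes the architecture only in prose, the following is a minimal PyTorch-style sketch of the two building blocks it mentions: pyramid pooling over a part feature map followed by SE channel attention on the concatenated scales. All layer sizes, pooling bins, and names (SEBlock, PyramidPoolFusion) are illustrative assumptions, not the authors' released code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SEBlock(nn.Module):
        # Squeeze-and-Excitation: re-weight channels so noisy or redundant
        # ones are suppressed and informative ones are kept.
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.fc1 = nn.Linear(channels, channels // reduction)
            self.fc2 = nn.Linear(channels // reduction, channels)

        def forward(self, x):                              # x: (N, C, H, W)
            w = F.adaptive_avg_pool2d(x, 1).flatten(1)     # squeeze -> (N, C)
            w = torch.sigmoid(self.fc2(F.relu(self.fc1(w))))  # excitation -> (N, C)
            return x * w.view(x.size(0), -1, 1, 1)         # channel-wise re-weighting

    class PyramidPoolFusion(nn.Module):
        # Pool a part feature map at several scales, upsample each scale back,
        # concatenate along channels, then gate the result with an SE block.
        def __init__(self, channels, bins=(1, 2, 4)):      # bins are an assumption
            super().__init__()
            self.bins = bins
            self.se = SEBlock(channels * (len(bins) + 1))

        def forward(self, x):                              # x: (N, C, H, W), one body part
            h, w = x.shape[2:]
            feats = [x]
            for b in self.bins:
                p = F.adaptive_avg_pool2d(x, b)            # multi-scale pooling
                feats.append(F.interpolate(p, size=(h, w),
                                           mode='bilinear',
                                           align_corners=False))  # upsample back
            fused = torch.cat(feats, dim=1)                # concatenate scales
            return self.se(fused)                          # filter upsampling noise

    # Example: a hypothetical part feature map from the backbone
    part = torch.randn(8, 256, 6, 8)
    out = PyramidPoolFusion(256)(part)                     # -> (8, 256 * 4, 6, 8)

In this sketch the SE gate sits directly on the concatenated pyramid features, mirroring the abstract's point that upsampling before concatenation introduces noise which channel attention then suppresses; the bi-branch supervision would attach a separate classifier to these fused features alongside the main part classifier.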