Summary: | Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 107 === Pedestrian attribute recognition is an important and valuable task in computer vision because of its wide range of applications, such as attribute-based person retrieval, marketing strategy development, and person re-identification. However, it is also a challenging task because of variations in viewpoint, pose, illumination, and background, and because the attributes are fine-grained. Although many methods have been proposed to deal with these issues, they neglect the low image quality that often occurs in surveillance camera footage. Work by Dodge also shows that image quality affects machine classification. To handle this issue, we propose a way to generate additional training samples and to make the model learn to select useful regions from different images and combine them into a new image for more efficient learning. In this way, our model can reduce the influence of low image quality (e.g., noise) and learn more robust features for more accurate classification. We evaluate on the two largest pedestrian attribute recognition datasets (PA-100K and RAP) through a series of experiments and ablation studies, which verify that our model further improves classification accuracy and demonstrate the effectiveness of the proposed architecture. Experimental results also show that our method, when added to common classification networks, outperforms other state-of-the-art methods. Furthermore, our method can be added to state-of-the-art methods to improve their accuracy further.
|