Person Re-Identification Based on Two-Stream Network With Attention and Pose Features

Due to variations in posture, blurring, occlusion, and other problems, person re-identification (Re-ID) remains a challenging task. In this paper, we combine the advantages of pose estimation and the attention mechanism in a two-stream network to address these problems and achieve better performance. Our proposed method consists of two main parts. 1) Spatial features from fused multi-layer features with attention: the same pedestrian presents different poses under different camera views, so simple spatial information is no longer reliable, and it becomes important to extract view-invariant features from multiple semantic levels. We therefore fuse the mid-level and high-level features and then correlate global information through self-attention. Because the mid-level and high-level features are fused, the semantic information is richer, which enables the attention mechanism to focus more effectively on the important regions of the image. 2) Aggregation of attention-stream and pose-estimation-stream features: although the self-attention mechanism can automatically attend to the important regions of the image, it may focus too heavily on the prominent parts of the body and ignore information at the body's edges. Hence, guidance from the pedestrian's pose is needed so that self-attention can attend to all parts of the body. Finally, we use bilinear pooling to aggregate the features of the two streams into the final representation. Without any data augmentation or re-ranking, we achieve rank-1 accuracy of 93.3% and 85.5% on the Market1501 and DukeMTMC-reID datasets, respectively, which indicates the effectiveness of our method.
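
The abstract describes the architecture only at a high level. As a rough illustration of the two-stream idea it outlines (self-attention over fused mid-/high-level features in one stream, pose features in the other, bilinear pooling to aggregate the two), here is a minimal PyTorch-style sketch. The class names, feature dimensions, pose-feature extractor, and the classifier size (751 identities, as in Market1501 training) are assumptions made for illustration, not the authors' actual implementation.

```python
# Minimal sketch, NOT the paper's implementation: one stream applies
# self-attention to fused mid-/high-level CNN features, a second stream
# stands in for pose-guided features, and bilinear pooling fuses them.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttention2d(nn.Module):
    """Non-local-style self-attention over the spatial positions of a feature map."""

    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)      # B x HW x C//8
        k = self.key(x).flatten(2)                        # B x C//8 x HW
        attn = torch.softmax(q @ k, dim=-1)               # B x HW x HW
        v = self.value(x).flatten(2).transpose(1, 2)      # B x HW x C
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + self.gamma * out


class TwoStreamReID(nn.Module):
    def __init__(self, mid_channels=512, high_channels=1024,
                 pose_dim=256, embed_dim=256, num_ids=751):
        super().__init__()
        # fuse mid- and high-level features (assumes matching spatial size)
        self.fuse = nn.Conv2d(mid_channels + high_channels, embed_dim, 1)
        self.attention = SelfAttention2d(embed_dim)
        self.pose_proj = nn.Linear(pose_dim, embed_dim)   # pose-stream descriptor
        self.classifier = nn.Linear(embed_dim * embed_dim, num_ids)

    def forward(self, mid_feat, high_feat, pose_feat):
        fused = self.fuse(torch.cat([mid_feat, high_feat], dim=1))
        attended = self.attention(fused)
        attn_vec = F.adaptive_avg_pool2d(attended, 1).flatten(1)  # B x D
        pose_vec = self.pose_proj(pose_feat)                      # B x D
        # bilinear pooling: outer product of the two stream descriptors
        bilinear = torch.einsum("bi,bj->bij", attn_vec, pose_vec).flatten(1)
        bilinear = F.normalize(torch.sign(bilinear) * bilinear.abs().sqrt(), dim=1)
        return self.classifier(bilinear), bilinear


# Toy usage: random tensors stand in for backbone and pose-network outputs.
if __name__ == "__main__":
    model = TwoStreamReID()
    mid = torch.randn(2, 512, 16, 8)
    high = torch.randn(2, 1024, 16, 8)
    pose = torch.randn(2, 256)
    logits, embedding = model(mid, high, pose)
    print(logits.shape, embedding.shape)  # torch.Size([2, 751]) torch.Size([2, 65536])
```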

Bibliographic Details
Main Authors: Xiaowei Gong (ORCID: 0000-0002-4828-5928), Suguo Zhu; both with the Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access, Vol. 7 (2019), pp. 131374-131382
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2935116
Subjects: Attention; pose estimation; person re-identification; two-stream
Online Access: https://ieeexplore.ieee.org/document/8795487/