Person Re-Identification Based on Two-Stream Network With Attention and Pose Features
Person re-identification (Re-ID) remains a challenging task because of pose variation, blur, occlusion, and other problems. In this paper, we combine the advantages of pose estimation and the attention mechanism to address these problems and improve performance, which combines pose and attentio...
Main Authors: | Xiaowei Gong, Suguo Zhu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | Attention; pose estimation; person re-identification; two-stream |
Online Access: | https://ieeexplore.ieee.org/document/8795487/ |
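
The summary above mentions an attention mechanism, and the full abstract in the record below specifies self-attention applied to fused mid-level and high-level features. As a rough illustration only, the following pure-Python sketch concatenates two feature levels per spatial location and then applies scaled dot-product self-attention with no learned projections. The function names (`fuse_levels`, `self_attention`), the concatenation-based fusion, and the toy inputs are assumptions made for this sketch; they are not taken from the paper, whose architecture may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(mid_feats, high_feats):
    """Concatenate mid-level and high-level channels at each location.

    Assumption: both inputs are lists with one channel vector per spatial
    location, already resized to the same number of locations.
    """
    if len(mid_feats) != len(high_feats):
        raise ValueError("both levels need the same number of locations")
    return [mid + high for mid, high in zip(mid_feats, high_feats)]


def self_attention(feats):
    """Scaled dot-product self-attention without learned projections.

    Each output vector is a weighted average of all input vectors, so every
    location can draw on information from every other location.
    """
    dim = len(feats[0])
    outputs = []
    for query in feats:
        # Similarity of this location to every location, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in feats
        ]
        weights = softmax(scores)
        # Weighted average of all location vectors, channel by channel.
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, feats)) for c in range(dim)]
        )
    return outputs


# Toy example: two locations, two mid-level channels, one high-level channel.
fused = fuse_levels([[1.0, 0.0], [0.0, 1.0]], [[1.0], [0.0]])
attended = self_attention(fused)
```

A trained model would normally apply learned query, key, and value projections before the dot product; they are omitted here to keep the sketch short.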
id |
doaj-b4dfba4b039a49b389363482e0c795de |
---|---|
record_format |
Article |
spelling |
doaj-b4dfba4b039a49b389363482e0c795de; 2021-04-05T17:17:04Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7; pp. 131374-131382; DOI 10.1109/ACCESS.2019.2935116; 8795487
Person Re-Identification Based on Two-Stream Network With Attention and Pose Features
Xiaowei Gong [0] (https://orcid.org/0000-0002-4828-5928); Suguo Zhu [1]
[0, 1] Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
Person re-identification (Re-ID) remains a challenging task because of pose variation, blur, occlusion, and other problems. In this paper, we combine the advantages of pose estimation and the attention mechanism in a two-stream network to address these problems and improve performance. Our proposed method consists of two main parts. 1) Spatial Features with Multi-Layer Feature Fusion and Attention: the same pedestrian presents different postures under different camera angles, so simple spatial information alone is no longer reliable. It therefore becomes important to extract view-invariant features from multiple semantic levels. Consequently, we fuse the mid-level and high-level features and then correlate global information through self-attention. Because the mid-level and high-level features are fused, the semantic information is richer, which enables the attention mechanism to focus better on the important areas of the image. 2) Aggregation of Attention Stream and Pose Estimation Stream Features: although the self-attention mechanism can automatically attend to the important areas of the image, it may focus too heavily on the prominent parts of the body and ignore the edge information of the body. Hence, guidance from the pedestrian's posture is needed so that self-attention can attend to all parts of the body. Finally, we use bilinear pooling to aggregate the features of the two streams into the final features. Without any data augmentation or re-ranking, we achieve rank-1 accuracy of 93.3% and 85.5% on the Market1501 and DukeMTMC-reID datasets, respectively, which indicates the effectiveness of our method.
https://ieeexplore.ieee.org/document/8795487/
Attention; pose estimation; person re-identification; two-stream |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xiaowei Gong; Suguo Zhu |
spellingShingle |
Xiaowei Gong Suguo Zhu Person Re-Identification Based on Two-Stream Network With Attention and Pose Features IEEE Access Attention pose estimation person re-identification two-stream |
author_facet |
Xiaowei Gong; Suguo Zhu |
author_sort |
Xiaowei Gong |
title |
Person Re-Identification Based on Two-Stream Network With Attention and Pose Features |
title_sort |
person re-identification based on two-stream network with attention and pose features |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
Person re-identification (Re-ID) remains a challenging task because of pose variation, blur, occlusion, and other problems. In this paper, we combine the advantages of pose estimation and the attention mechanism in a two-stream network to address these problems and improve performance. Our proposed method consists of two main parts. 1) Spatial Features with Multi-Layer Feature Fusion and Attention: the same pedestrian presents different postures under different camera angles, so simple spatial information alone is no longer reliable. It therefore becomes important to extract view-invariant features from multiple semantic levels. Consequently, we fuse the mid-level and high-level features and then correlate global information through self-attention. Because the mid-level and high-level features are fused, the semantic information is richer, which enables the attention mechanism to focus better on the important areas of the image. 2) Aggregation of Attention Stream and Pose Estimation Stream Features: although the self-attention mechanism can automatically attend to the important areas of the image, it may focus too heavily on the prominent parts of the body and ignore the edge information of the body. Hence, guidance from the pedestrian's posture is needed so that self-attention can attend to all parts of the body. Finally, we use bilinear pooling to aggregate the features of the two streams into the final features. Without any data augmentation or re-ranking, we achieve rank-1 accuracy of 93.3% and 85.5% on the Market1501 and DukeMTMC-reID datasets, respectively, which indicates the effectiveness of our method. |
topic |
Attention; pose estimation; person re-identification; two-stream |
url |
https://ieeexplore.ieee.org/document/8795487/ |
work_keys_str_mv |
AT xiaoweigong personreidentificationbasedontwostreamnetworkwithattentionandposefeatures AT suguozhu personreidentificationbasedontwostreamnetworkwithattentionandposefeatures |
_version_ |
1721539945402728448 |
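
The abstract in this record states that bilinear pooling aggregates the attention-stream and pose-stream features into the final representation. A minimal sketch of one common form of bilinear pooling follows: the outer product of the two streams' channel vectors is summed over spatial locations, then passed through a signed square root and L2 normalisation. The function name `bilinear_pool`, the two normalisation steps, and the toy inputs are assumptions for illustration; the abstract does not say which bilinear pooling variant the authors use.

```python
import math


def bilinear_pool(attn_map, pose_map):
    """Fuse two feature maps by summed outer products (hypothetical helper).

    attn_map: list of per-location channel vectors from the attention stream.
    pose_map: list of per-location channel vectors from the pose stream.
    Returns a flat vector of length len(attn_map[0]) * len(pose_map[0]).
    """
    if len(attn_map) != len(pose_map):
        raise ValueError("both streams need the same number of locations")
    ca, cb = len(attn_map[0]), len(pose_map[0])
    pooled = [0.0] * (ca * cb)
    # Accumulate the outer product of the two channel vectors per location.
    for a_vec, p_vec in zip(attn_map, pose_map):
        for i, a in enumerate(a_vec):
            for j, p in enumerate(p_vec):
                pooled[i * cb + j] += a * p
    # Signed square root dampens very large pairwise responses (assumed step).
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    # L2-normalise so the fused descriptor has unit length (assumed step).
    norm = math.sqrt(sum(v * v for v in signed))
    return [v / norm for v in signed] if norm > 0 else signed


# Toy example: two spatial locations, two channels per stream.
fused = bilinear_pool([[1.0, 0.0], [1.0, 2.0]], [[1.0, 1.0], [3.0, -1.0]])
```

Because every attention channel is multiplied with every pose channel, the fused vector grows as the product of the two channel counts, which is why practical systems often use compact approximations of this operation.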