Paralleled attention modules and adaptive focal loss for Siamese visual tracking
Abstract Recently, Siamese‐based trackers have drawn considerable attention in the visual tracking field because of their excellent performance on different tracking benchmarks. However, most Siamese‐based trackers encounter difficulties under circumstances such as interference from similar objects and backgro...
Main Authors: | Yuyao Zhao, Min Jiang, Jun Kong, Sha Li |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2021-05-01 |
Series: | IET Image Processing |
Online Access: | https://doi.org/10.1049/ipr2.12109 |
id |
doaj-3c7c237c9fd943c0a25ca1ea8b8af316 |
---|---|
record_format |
Article |
spelling |
doaj-3c7c237c9fd943c0a25ca1ea8b8af316; 2021-07-14T13:25:04Z; eng; Wiley; IET Image Processing; 1751-9659; 1751-9667; 2021-05-01; 15; 6; 1345; 1358; 10.1049/ipr2.12109; Paralleled attention modules and adaptive focal loss for Siamese visual tracking; Yuyao Zhao 0; Min Jiang 1; Jun Kong 2; Sha Li 3; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China; Jiangsu Key Construction Laboratory of IoT Application Technology, Wuxi Taihu University, Wuxi, China; Abstract Recently, Siamese‐based trackers have drawn considerable attention in the visual tracking field because of their excellent performance on different tracking benchmarks. However, most Siamese‐based trackers encounter difficulties under circumstances such as interference from similar objects and background clutter. In addition, an extreme foreground–background data imbalance weakens performance during training, yet few loss functions address it. The authors address these issues by introducing a module named paralleled spatial and channel attention (PSCA) and a loss function named adaptive focal loss (AFL). First, PSCA is proposed to enhance the extracted features and eliminate noise in both the spatial and channel dimensions. Second, AFL is proposed as the loss function so that the model focuses on hard samples, which contribute more to the training process. Finally, PSCA is combined with a modified ResNet to extract more powerful features. Experimental results show that the authors' method achieves outstanding performance on multiple benchmarks while keeping a beyond‐real‐time frame rate.; https://doi.org/10.1049/ipr2.12109 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yuyao Zhao Min Jiang Jun Kong Sha Li |
title |
Paralleled attention modules and adaptive focal loss for Siamese visual tracking |
publisher |
Wiley |
series |
IET Image Processing |
issn |
1751-9659 1751-9667 |
publishDate |
2021-05-01 |
description |
Abstract Recently, Siamese‐based trackers have drawn considerable attention in the visual tracking field because of their excellent performance on different tracking benchmarks. However, most Siamese‐based trackers encounter difficulties under circumstances such as interference from similar objects and background clutter. In addition, an extreme foreground–background data imbalance weakens performance during training, yet few loss functions address it. The authors address these issues by introducing a module named paralleled spatial and channel attention (PSCA) and a loss function named adaptive focal loss (AFL). First, PSCA is proposed to enhance the extracted features and eliminate noise in both the spatial and channel dimensions. Second, AFL is proposed as the loss function so that the model focuses on hard samples, which contribute more to the training process. Finally, PSCA is combined with a modified ResNet to extract more powerful features. Experimental results show that the authors' method achieves outstanding performance on multiple benchmarks while keeping a beyond‐real‐time frame rate. |
url |
https://doi.org/10.1049/ipr2.12109 |