Paralleled attention modules and adaptive focal loss for Siamese visual tracking

Abstract: Recently, Siamese‐based trackers have drawn considerable attention in the visual tracking field because of their excellent performance on different tracking benchmarks. However, most Siamese‐based trackers encounter difficulties under circumstances such as interference from similar objects and backgro...

Bibliographic Details
Main Authors: Yuyao Zhao, Min Jiang, Jun Kong, Sha Li
Format: Article
Language: English
Published: Wiley 2021-05-01
Series: IET Image Processing
Online Access: https://doi.org/10.1049/ipr2.12109
id doaj-3c7c237c9fd943c0a25ca1ea8b8af316
record_format Article
spelling doaj-3c7c237c9fd943c0a25ca1ea8b8af316
2021-07-14T13:25:04Z
eng
Wiley
IET Image Processing
1751-9659
1751-9667
2021-05-01
Volume 15, Issue 6, Pages 1345-1358
10.1049/ipr2.12109
Paralleled attention modules and adaptive focal loss for Siamese visual tracking
Yuyao Zhao (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China)
Min Jiang (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China)
Jun Kong (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China)
Sha Li (Jiangsu Key Construction Laboratory of IoT Application Technology, Wuxi Taihu University, Wuxi, China)
https://doi.org/10.1049/ipr2.12109
collection DOAJ
language English
format Article
sources DOAJ
author Yuyao Zhao
Min Jiang
Jun Kong
Sha Li
title Paralleled attention modules and adaptive focal loss for Siamese visual tracking
publisher Wiley
series IET Image Processing
issn 1751-9659
1751-9667
publishDate 2021-05-01
description Abstract: Recently, Siamese‐based trackers have drawn considerable attention in the visual tracking field because of their excellent performance on different tracking benchmarks. However, most Siamese‐based trackers encounter difficulties under circumstances such as interference from similar objects and background clutter. In addition, an extreme foreground–background data imbalance weakens performance during training, yet few loss functions address it. The authors address these issues by introducing a module named paralleled spatial and channel attention (PSCA) and a loss function named adaptive focal loss (AFL). First, PSCA is proposed to enhance the extracted features and eliminate noise in both the spatial and channel dimensions. Second, AFL is proposed as the loss function so that the model focuses on hard samples, which contribute more to the training process. Finally, PSCA and a modified ResNet are combined to extract more powerful features. Experimental results show that the authors' method achieves outstanding performance on multiple benchmarks while maintaining a beyond‐real‐time frame rate.
url https://doi.org/10.1049/ipr2.12109