Learning global image representation with generalized-mean pooling and smoothed average precision for large-scale CBIR

Content-based image retrieval (CBIR) is the problem of searching an image database for items that are similar to the query image. Most existing image retrieval methods are trained with metric learning loss functions (e.g. contrastive loss or triplet loss), which require the us...

Bibliographic Details
Main Authors: Li, Y. (Author), Wang, C. (Author), Yang, B. (Author), Yao, J. (Author)
Format: Article
Language: English
Published: John Wiley and Sons Inc 2023
Subjects: computer vision, content-based retrieval, image retrieval
Online Access: View Fulltext in Publisher
LEADER 02109nam a2200205Ia 4500
001 10.1049-ipr2.12825
008 230526s2023 CNT 000 0 und d
020 |a 17519659 (ISSN) 
245 1 0 |a Learning global image representation with generalized-mean pooling and smoothed average precision for large-scale CBIR 
260 0 |b John Wiley and Sons Inc  |c 2023 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1049/ipr2.12825 
520 3 |a Content-based image retrieval (CBIR) is the problem of searching an image database for items that are similar to the query image. Most existing image retrieval methods are trained with metric learning loss functions (e.g. contrastive loss or triplet loss), which require hard sample mining strategies (HMS) to train the model well. HMS involves picking out hard positive or negative samples, which increases the complexity of model training and requires a large amount of additional training time. To address this issue, lessons from recent work on representation learning are leveraged and a model called GS is proposed that combines the state-of-the-art Generalized-Mean (GeM) pooling with the smoothed average precision (AP). The entire network can be learned end-to-end by approximating the non-differentiable AP function with a differentiable one, without mining hard samples and using only image-level annotations. A model named GSA is also presented, which achieves excellent retrieval performance when jointly trained with two different loss functions. Experimental results validate the effectiveness of the proposed approach and demonstrate competitive performance on common standard image retrieval datasets (Revisited Oxford and Paris). © 2023 The Authors. IET Image Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology. 
650 0 4 |a computer vision 
650 0 4 |a content-based retrieval 
650 0 4 |a image retrieval 
700 1 0 |a Li, Y.  |e author 
700 1 0 |a Wang, C.  |e author 
700 1 0 |a Yang, B.  |e author 
700 1 0 |a Yao, J.  |e author 
773 |t IET Image Processing  |x 17519659 (ISSN)
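
Illustrative note: the abstract names two building blocks, Generalized-Mean (GeM) pooling and a smoothed (differentiable) approximation of average precision. The sketch below shows one common way these two ideas are usually formulated. It is not taken from the article: the function names `gem_pool` and `smooth_ap`, the default exponent `p`, and the temperature `tau` are assumptions chosen for illustration, and the authors' exact formulation may differ.

```python
import math


def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized-mean pooling over one feature channel (hypothetical helper).

    activations: non-negative responses at every spatial location.
    p = 1 reduces to average pooling; large p approaches max pooling.
    """
    clamped = [max(a, eps) for a in activations]  # avoid zero/negative bases
    return (sum(a ** p for a in clamped) / len(clamped)) ** (1.0 / p)


def _sigmoid(x, tau):
    """Numerically stable sigmoid with temperature tau (assumed parameter)."""
    z = x / tau
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smooth_ap(scores, labels, tau=0.01):
    """Sigmoid-relaxed average precision for a single query (a sketch).

    scores: similarity of each database item to the query.
    labels: 1 if the item is relevant to the query, else 0.
    The hard indicator "item j ranks above item i" is replaced by a sigmoid,
    which makes the quantity differentiable with respect to the scores.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        return 0.0
    total = 0.0
    for i in positives:
        # Soft rank of item i among the relevant items only.
        rank_pos = 1.0 + sum(
            _sigmoid(scores[j] - scores[i], tau) for j in positives if j != i
        )
        # Soft rank of item i among all database items.
        rank_all = 1.0 + sum(
            _sigmoid(scores[j] - scores[i], tau)
            for j in range(len(scores))
            if j != i
        )
        total += rank_pos / rank_all
    return total / len(positives)
```

In a training setup of this kind, `1 - smooth_ap(...)` would typically serve as the loss, with a smaller `tau` giving a tighter but less smooth approximation of the true AP.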