Large-Scale Video Retrieval via Deep Local Convolutional Features

In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm uses clustering to remove redundant information from the video data, which greatly reduces storage cost. The feature aggregation strategy adopts average pooling to encode deep local convolutional features, followed by coarse-to-fine retrieval, which allows rapid search in a large-scale video database. Results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency and accuracy over other state-of-the-art visual search methods.
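The abstract names its two indexing components only at a high level. The Python sketch below is a rough illustration rather than the authors' published algorithm: it assumes key frames are chosen by k-means clustering of per-frame descriptors (keeping the frame nearest each centroid) and that a deep local convolutional feature map is aggregated by spatial average pooling into a single L2-normalized descriptor. The function names, the choice of k-means, and the normalization step are all illustrative assumptions.

```python
# Hypothetical sketch only: the abstract says key-frame extraction relies on
# clustering and that deep local convolutional features are aggregated by
# average pooling, but gives no implementation details. The concrete choices
# below (k-means, nearest-to-centroid selection, L2 normalization) are
# assumptions made for illustration.
import numpy as np
from sklearn.cluster import KMeans


def select_key_frames(frame_features: np.ndarray, n_key_frames: int) -> np.ndarray:
    """Pick one representative frame per cluster of per-frame descriptors.

    frame_features: array of shape (num_frames, dim).
    Returns the sorted indices of the selected key frames.
    """
    kmeans = KMeans(n_clusters=n_key_frames, n_init=10, random_state=0)
    labels = kmeans.fit_predict(frame_features)
    key_indices = []
    for c in range(n_key_frames):
        members = np.flatnonzero(labels == c)
        dists = np.linalg.norm(
            frame_features[members] - kmeans.cluster_centers_[c], axis=1
        )
        key_indices.append(members[np.argmin(dists)])
    return np.sort(np.asarray(key_indices))


def aggregate_conv_features(feature_map: np.ndarray) -> np.ndarray:
    """Average-pool a (C, H, W) convolutional feature map over its spatial
    positions into a C-dimensional descriptor and L2-normalize it, so that
    key frames and query images can be compared with a dot product."""
    descriptor = feature_map.reshape(feature_map.shape[0], -1).mean(axis=1)
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor
```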


Bibliographic Details
Main Authors: Chen Zhang, Bin Hu, Yucong Suo, Zhiqiang Zou, Yimu Ji
Author Affiliations:
  Chen Zhang: College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
  Bin Hu: College of Geographic Science, Nanjing Normal University, Nanjing, China
  Yucong Suo: Bell Honor School, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
  Zhiqiang Zou: College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
  Yimu Ji: College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
Format: Article
Language: English
Published: Hindawi Limited, 2020-01-01
Series: Advances in Multimedia
ISSN: 1687-5680, 1687-5699
Online Access: http://dx.doi.org/10.1155/2020/7862894
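The coarse-to-fine retrieval step is likewise only named in the abstract. The sketch below shows one plausible reading under assumed data structures: a coarse pass ranks whole videos by a video-level descriptor (here simply the mean of their key-frame descriptors), and a fine pass re-ranks the individual key frames of the shortlisted videos. It is an assumed interpretation for illustration, not the authors' exact procedure.

```python
# Assumed interpretation of "coarse-to-fine retrieval": coarse ranking of
# videos, then fine re-ranking of key frames inside the shortlist.
import numpy as np


def coarse_to_fine_search(query_desc, frame_descs_per_video, top_videos=10):
    """query_desc: L2-normalized (dim,) descriptor of the query image.
    frame_descs_per_video: list of (num_key_frames_i, dim) arrays, one per video.
    Returns (video_id, frame_id, similarity) tuples, best match first."""
    # Coarse stage: one descriptor per video (mean of its key-frame descriptors).
    video_descs = np.stack([f.mean(axis=0) for f in frame_descs_per_video])
    video_descs /= np.linalg.norm(video_descs, axis=1, keepdims=True)
    shortlist = np.argsort(-video_descs @ query_desc)[:top_videos]

    # Fine stage: score every key frame of the shortlisted videos.
    hits = []
    for v in shortlist:
        sims = frame_descs_per_video[v] @ query_desc
        best = int(np.argmax(sims))
        hits.append((int(v), best, float(sims[best])))
    return sorted(hits, key=lambda h: -h[2])
```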