Large-Scale Video Retrieval via Deep Local Convolutional Features
In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. We propose a novel framework based on convolutional neural networks (CNNs) that performs large-scale video retrieval with low storage cost and high search efficiency. The framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm exploits clustering to remove redundant information from the video data, greatly reducing storage cost. The feature aggregation strategy encodes deep local convolutional features by average pooling and is followed by coarse-to-fine retrieval, which enables rapid search over a large-scale video database. Extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency and accuracy compared with other state-of-the-art visual search methods.
Main Authors: | Chen Zhang, Bin Hu, Yucong Suo, Zhiqiang Zou, Yimu Ji |
---|---|
Author Affiliations: | College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China (Chen Zhang, Zhiqiang Zou, Yimu Ji); College of Geographic Science, Nanjing Normal University, Nanjing, China (Bin Hu); Bell Honor School, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China (Yucong Suo) |
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2020-01-01 |
Series: | Advances in Multimedia |
ISSN: | 1687-5680, 1687-5699 |
Collection: | DOAJ |
Online Access: | http://dx.doi.org/10.1155/2020/7862894 |
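
The abstract above outlines a three-stage pipeline: clustering-based key-frame selection, average pooling of deep local convolutional features into compact descriptors, and coarse-to-fine retrieval. The following is a minimal sketch of that idea, not the authors' implementation: the feature shapes, the use of k-means from scikit-learn, and the dot-product similarity are illustrative assumptions.

```python
# Hedged sketch (not the paper's code): clustering-based key-frame selection,
# average pooling of deep local convolutional features, and coarse-to-fine search.
# Feature dimensions, cluster counts, and similarity measures are assumptions.
import numpy as np
from sklearn.cluster import KMeans


def aggregate(conv_map: np.ndarray) -> np.ndarray:
    """Average-pool a (C, H, W) convolutional feature map into an L2-normalised C-dim vector."""
    v = conv_map.reshape(conv_map.shape[0], -1).mean(axis=1)
    return v / (np.linalg.norm(v) + 1e-12)


def select_key_frames(frame_vecs: np.ndarray, n_clusters: int = 5):
    """Cluster frame descriptors and keep the frame nearest each centroid (one key frame per cluster)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(frame_vecs)
    keep = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        dists = np.linalg.norm(frame_vecs[members] - km.cluster_centers_[c], axis=1)
        keep.append(int(members[np.argmin(dists)]))
    return sorted(keep)


def coarse_to_fine_search(query_map: np.ndarray, videos: dict, top_videos: int = 2):
    """Coarse stage: rank videos by their mean key-frame descriptor. Fine stage: re-rank key frames."""
    q = aggregate(query_map)
    video_descs = {vid: np.mean(kf, axis=0) for vid, kf in videos.items()}
    coarse = sorted(video_descs, key=lambda vid: -float(q @ video_descs[vid]))[:top_videos]
    fine = [(vid, i, float(q @ f)) for vid in coarse for i, f in enumerate(videos[vid])]
    return sorted(fine, key=lambda t: -t[2])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in "deep local features": 30 frames per video, each a 256 x 7 x 7 conv map.
    database = {}
    for vid in ("video_a", "video_b", "video_c"):
        maps = rng.normal(size=(30, 256, 7, 7))
        frame_vecs = np.stack([aggregate(m) for m in maps])
        key_idx = select_key_frames(frame_vecs)      # redundant frames discarded
        database[vid] = frame_vecs[key_idx]          # only key-frame descriptors are stored
    hits = coarse_to_fine_search(rng.normal(size=(256, 7, 7)), database)
    print(hits[:3])                                  # top (video, key-frame index, score) triples
```

Storing only the pooled key-frame descriptors is what keeps the index small, and the coarse pass over per-video means limits how many frame descriptors the fine pass has to score, which is the storage/efficiency trade-off the abstract claims.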