Short Video Behavior Recognition Combining Scene and Behavior Features

Behavior recognition methods focus mainly on the action itself, but a short video contains little information, so the various features in the video should be exploited as fully as possible to improve recognition accuracy. Therefore, the short video behavior...

Bibliographic Details
Main Authors: DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan
Format: Article
Language: zho
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 2020-10-01
Series: Jisuanji kexue yu tansuo
Subjects: scene recognition; action recognition; dictionary learning; deep learning; video understanding
Online Access: http://fcst.ceaj.org/CN/abstract/abstract2412.shtml
id doaj-3f1e97284d4c446486070ff20cbdf291
record_format Article
spelling doaj-3f1e97284d4c446486070ff20cbdf291 | 2021-08-10T08:25:58Z | zho | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press | Jisuanji kexue yu tansuo | ISSN 1673-9418 | 2020-10-01 | vol. 14, no. 10, pp. 1754-1761 | DOI 10.3778/j.issn.1673-9418.1909044 | Short Video Behavior Recognition Combining Scene and Behavior Features | DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan | School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China | http://fcst.ceaj.org/CN/abstract/abstract2412.shtml | scene recognition; action recognition; dictionary learning; deep learning; video understanding
collection DOAJ
language zho
format Article
sources DOAJ
author DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan
title Short Video Behavior Recognition Combining Scene and Behavior Features
publisher Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
series Jisuanji kexue yu tansuo
issn 1673-9418
publishDate 2020-10-01
description Behavior recognition methods focus mainly on the action itself, but a short video contains little information, so the various features in the video should be exploited as fully as possible to improve recognition accuracy. This paper therefore studies a short video behavior recognition method based on joint scene and behavior features, in which scene information serves as context to improve on a traditional single behavior recognition network. First, scene features are extracted from the short video with a deep fusion network. Then, behavior features are extracted with a variable convolutional network, covering both RGB features and flow features. Finally, dictionary learning is used to represent the joint features sparsely, yielding more explanatory feature information for short video behavior recognition. The method reaches a top-5 accuracy of 33% on the Charades test set, outperforming the traditional single behavior recognition network.
topic scene recognition
action recognition
dictionary learning
deep learning
video understanding
url http://fcst.ceaj.org/CN/abstract/abstract2412.shtml
_version_ 1721212446256922624
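
The final step named in the description, a sparse representation of the joint scene and behavior features, can be illustrated with a small sketch. It is not the authors' implementation: the record does not say how the features are fused or which solver is used, so the concatenation in `joint_feature`, the ISTA solver in `sparse_code`, the fixed toy dictionary `atoms`, and all numeric values are assumptions made for illustration. The sketch covers only the sparse coding step against a given dictionary, not the learning of the dictionary itself.

```python
from typing import List

Vector = List[float]


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink a value toward zero (proximal operator of the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def joint_feature(scene: Vector, behavior: Vector) -> Vector:
    """Assumed fusion rule: plain concatenation of the two feature vectors."""
    return list(scene) + list(behavior)


def sparse_code(signal: Vector, atoms: List[Vector],
                lam: float = 0.1, iters: int = 200) -> Vector:
    """ISTA for min_a 0.5 * ||signal - sum_k a[k] * atoms[k]||^2 + lam * ||a||_1."""
    # The squared Frobenius norm of the dictionary upper-bounds the Lipschitz
    # constant of the data term, so its reciprocal is a safe step size.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    codes = [0.0] * len(atoms)
    for _ in range(iters):
        # Reconstruct the signal from the current codes.
        recon = [sum(codes[k] * atoms[k][i] for k in range(len(atoms)))
                 for i in range(len(signal))]
        residual = [s - r for s, r in zip(signal, recon)]
        # Gradient step on the data term, followed by shrinkage.
        codes = [
            soft_threshold(
                codes[k] + step * sum(a * r for a, r in zip(atoms[k], residual)),
                step * lam,
            )
            for k in range(len(atoms))
        ]
    return codes


# Hypothetical toy inputs: a 2-d scene feature and a 1-d behavior feature.
scene = [2.0, 0.2]
behavior = [-1.0]
# Hypothetical dictionary: one atom per row, here the standard basis.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
codes = sparse_code(joint_feature(scene, behavior), atoms, lam=0.5)
print(codes)
```

With this orthonormal toy dictionary the codes are the soft-thresholded joint feature, so the small middle entry is driven exactly to zero while the two larger entries are shrunk by `lam`. A dictionary learning method would alternate a coding step like this with an update of the atoms over many training samples.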