Short Video Behavior Recognition Combining Scene and Behavior Features
Behavior recognition methods focus mainly on the action itself, but a short video contains little information, so the various features in the video should be exploited as fully as possible to improve recognition accuracy. Therefore, the short video behavior...
Main Author: | |
---|---|
Format: | Article |
Language: | zho |
Published: |
Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
2020-10-01
|
Series: | Jisuanji kexue yu tansuo |
Subjects: | |
Online Access: | http://fcst.ceaj.org/CN/abstract/abstract2412.shtml |
id |
doaj-3f1e97284d4c446486070ff20cbdf291 |
---|---|
record_format |
Article |
spelling |
doaj-3f1e97284d4c446486070ff20cbdf291; 2021-08-10T08:25:58Z; zho; Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press; Jisuanji kexue yu tansuo; 1673-9418; 2020-10-01; 14(10): 1754-1761; 10.3778/j.issn.1673-9418.1909044; Short Video Behavior Recognition Combining Scene and Behavior Features; DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan (School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China); Behavior recognition methods focus mainly on the action itself, but a short video contains little information, so the various features in the video should be exploited as fully as possible to improve recognition accuracy. This work therefore studies a short video behavior recognition method based on joint scene and behavior features, in which scene information serves as context to improve on a traditional behavior-only recognition network. First, scene features are extracted from the short video with a deep fusion network. Then, behavior features are extracted as RGB features and flow features with a variable convolutional network. Finally, dictionary learning is used to represent the joint features sparsely, yielding more explanatory feature information for short video behavior recognition. The method reaches a top-5 accuracy of 33% on the Charades test set, outperforming the traditional behavior-only recognition network.; http://fcst.ceaj.org/CN/abstract/abstract2412.shtml; scene recognition, action recognition, dictionary learning, deep learning, video understanding |
collection |
DOAJ |
language |
zho |
format |
Article |
sources |
DOAJ |
author |
DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan |
spellingShingle |
DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan; Short Video Behavior Recognition Combining Scene and Behavior Features; Jisuanji kexue yu tansuo; scene recognition, action recognition, dictionary learning, deep learning, video understanding |
author_facet |
DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan |
author_sort |
DONG Xu, TAN Li, ZHOU Lina, SONG Yanyan |
title |
Short Video Behavior Recognition Combining Scene and Behavior Features |
title_short |
Short Video Behavior Recognition Combining Scene and Behavior Features |
title_full |
Short Video Behavior Recognition Combining Scene and Behavior Features |
title_fullStr |
Short Video Behavior Recognition Combining Scene and Behavior Features |
title_full_unstemmed |
Short Video Behavior Recognition Combining Scene and Behavior Features |
title_sort |
short video behavior recognition combining scene and behavior features |
publisher |
Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press |
series |
Jisuanji kexue yu tansuo |
issn |
1673-9418 |
publishDate |
2020-10-01 |
description |
Behavior recognition methods focus mainly on the action itself, but a short video contains little information, so the various features in the video should be exploited as fully as possible to improve recognition accuracy. This work therefore studies a short video behavior recognition method based on joint scene and behavior features, in which scene information serves as context to improve on a traditional behavior-only recognition network. First, scene features are extracted from the short video with a deep fusion network. Then, behavior features are extracted as RGB features and flow features with a variable convolutional network. Finally, dictionary learning is used to represent the joint features sparsely, yielding more explanatory feature information for short video behavior recognition. The method reaches a top-5 accuracy of 33% on the Charades test set, outperforming the traditional behavior-only recognition network. |
topic |
scene recognition, action recognition, dictionary learning, deep learning, video understanding |
url |
http://fcst.ceaj.org/CN/abstract/abstract2412.shtml |
work_keys_str_mv |
AT dongxutanlizhoulinasongyanyan shortvideobehaviorrecognitioncombiningsceneandbehaviorfeatures |
_version_ |
1721212446256922624 |
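
The description above outlines a pipeline of joint scene and behavior features, a sparse representation over a dictionary, and a top-5 accuracy metric. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: the feature vectors are taken as already extracted, the dictionary `D` is given rather than learned, sparse coding uses plain ISTA (a swapped-in standard solver, not necessarily the authors' choice), and accuracy is computed in a single-label setting. All function names are hypothetical.

```python
import numpy as np


def joint_feature(scene_feat, rgb_feat, flow_feat):
    """Concatenate L2-normalised scene, RGB and flow feature vectors.

    Assumption: simple concatenation; the record does not say how
    the features are actually joined.
    """
    parts = [np.asarray(p, dtype=float) for p in (scene_feat, rgb_feat, flow_feat)]
    parts = [p / (np.linalg.norm(p) + 1e-12) for p in parts]
    return np.concatenate(parts)


def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_code(x, D, lam=0.1, n_iter=200):
    """Sparse code of x over dictionary D (shape: dim x atoms) via ISTA.

    Minimises 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.
    """
    step_inv = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / step_inv, lam / step_inv)
    return a


def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores."""
    scores = np.asarray(scores, dtype=float)
    topk = np.argsort(-scores, axis=1)[:, :k]
    hits = (topk == np.asarray(labels)[:, None]).any(axis=1)
    return float(hits.mean())
```

In this sketch, a classifier (not shown) would be trained on the sparse codes returned by `sparse_code`, and `top_k_accuracy(scores, labels, k=5)` would correspond to the top-5 figure quoted in the description, under the single-label assumption above.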