Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
The emergence of convolutional neural networks has greatly improved the accuracy of human action recognition. However, as networks deepen, fewer and fewer features are extracted, and in some datasets the size of the target to be recognized varies with the shooting angle. To s...
Main Authors: | Haofei Wang, Junfeng Li |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | Human action recognition; resnext-101 network; up-sampling method; weight fusion |
Online Access: | https://ieeexplore.ieee.org/document/9169613/ |
id |
doaj-596ca8945e53456dae9f2ea03cb4d7ba |
---|---|
record_format |
Article |
spelling |
doaj-596ca8945e53456dae9f2ea03cb4d7ba; 2021-03-30T04:46:58Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; vol. 8; pp. 150945-150954; 10.1109/ACCESS.2020.3017076; 9169613; Human Action Recognition Algorithm Based on Multi-Feature Map Fusion; Haofei Wang (0), https://orcid.org/0000-0002-3898-0927; Junfeng Li (1), https://orcid.org/0000-0002-1207-3317; (0) Department of Control Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, China; (1) Department of Control Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, China; The emergence of convolutional neural networks has greatly improved the accuracy of human action recognition. However, as networks deepen, fewer and fewer features are extracted, and in some datasets the size of the target to be recognized varies with the shooting angle. To address these problems, we propose an improved ResNeXt-based human action recognition method that uses multi-feature map fusion. First, each video is uniformly sampled to generate training samples, and samples with different frames are used as the input to the network. Second, we add n up-sampling layers after layer 1 of ResNeXt to enlarge the feature maps and extract multiple feature maps, so that the extracted feature maps are clearer and small targets are recognized better. Finally, the n results obtained are fused with a weighted geometric mean combination forecasting method based on the L_1 norm to obtain the final result. In experiments on UCF-101 and HMDB-51, our model reaches an accuracy of 90.3% on UCF-101, which is higher than most state-of-the-art algorithms.; https://ieeexplore.ieee.org/document/9169613/; Human action recognition; resnext-101 network; up-sampling method; weight fusion |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Haofei Wang; Junfeng Li |
spellingShingle |
Haofei Wang; Junfeng Li; Human Action Recognition Algorithm Based on Multi-Feature Map Fusion; IEEE Access; Human action recognition; resnext-101 network; up-sampling method; weight fusion |
author_facet |
Haofei Wang; Junfeng Li |
author_sort |
Haofei Wang |
title |
Human Action Recognition Algorithm Based on Multi-Feature Map Fusion |
title_sort |
human action recognition algorithm based on multi-feature map fusion |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
The emergence of convolutional neural networks has greatly improved the accuracy of human action recognition. However, as networks deepen, fewer and fewer features are extracted, and in some datasets the size of the target to be recognized varies with the shooting angle. To address these problems, we propose an improved ResNeXt-based human action recognition method that uses multi-feature map fusion. First, each video is uniformly sampled to generate training samples, and samples with different frames are used as the input to the network. Second, we add n up-sampling layers after layer 1 of ResNeXt to enlarge the feature maps and extract multiple feature maps, so that the extracted feature maps are clearer and small targets are recognized better. Finally, the n results obtained are fused with a weighted geometric mean combination forecasting method based on the L_1 norm to obtain the final result. In experiments on UCF-101 and HMDB-51, our model reaches an accuracy of 90.3% on UCF-101, which is higher than most state-of-the-art algorithms. |
topic |
Human action recognition; resnext-101 network; up-sampling method; weight fusion |
url |
https://ieeexplore.ieee.org/document/9169613/ |
work_keys_str_mv |
AT haofeiwang humanactionrecognitionalgorithmbasedonmultifeaturemapfusion AT junfengli humanactionrecognitionalgorithmbasedonmultifeaturemapfusion |
_version_ |
1724181216448479232 |
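
The description above mentions two steps that are easy to illustrate: uniform sampling of video frames and fusion of n per-branch results with a weighted geometric mean. The following Python sketch shows one possible reading of those steps. It is not the authors' implementation: the function names, the segment-centre sampling rule, and the final renormalisation are assumptions, and the weights are taken as given because the record does not say how the L_1-norm-based method derives them.

```python
import math
from typing import List, Sequence


def uniform_sample_indices(num_frames: int, clip_len: int) -> List[int]:
    """Pick clip_len frame indices spread evenly over num_frames frames.

    Assumption: the centre of each of clip_len equal-width segments is used.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    step = num_frames / clip_len
    return [min(num_frames - 1, int(step * (i + 0.5))) for i in range(clip_len)]


def weighted_geometric_fusion(
    scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-12,
) -> List[float]:
    """Fuse n per-class score vectors with a weighted geometric mean.

    scores[k][c] is the score of class c from branch k (hypothetical layout).
    weights[k] is the non-negative weight of branch k; how the weights are
    chosen (the L_1-norm criterion in the abstract) is outside this sketch.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("need one weight per score vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm_w = [w / total for w in weights]  # weights now sum to 1

    fused = []
    for c in range(len(scores[0])):
        # Work in log space: prod_k s_k^w_k = exp(sum_k w_k * log s_k).
        log_sum = sum(w * math.log(max(s[c], eps)) for w, s in zip(norm_w, scores))
        fused.append(math.exp(log_sum))

    z = sum(fused)
    return [f / z for f in fused]  # renormalise so the fused scores sum to 1


if __name__ == "__main__":
    print(uniform_sample_indices(16, 4))
    print(weighted_geometric_fusion([[0.9, 0.1], [0.6, 0.4]], [1.0, 1.0]))
```

Working in log space avoids underflow when many small probabilities are multiplied, and the `eps` floor keeps a single zero score from one branch from forcing the fused score for that class to exactly zero.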