Human Action Recognition Algorithm Based on Multi-Feature Map Fusion

The emergence of the convolutional neural network has greatly improved the accuracy of human action recognition. However, as the network deepens, fewer and fewer features are extracted, and in some datasets the size of the target to be recognized varies with the shooting angle. To s...

Bibliographic Details
Main Authors: Haofei Wang, Junfeng Li
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Human action recognition; resnext-101 network; up-sampling method; weight fusion
Online Access: https://ieeexplore.ieee.org/document/9169613/
id doaj-596ca8945e53456dae9f2ea03cb4d7ba
record_format Article
spelling doaj-596ca8945e53456dae9f2ea03cb4d7ba; 2021-03-30T04:46:58Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 150945; 150954; 10.1109/ACCESS.2020.3017076; 9169613; Human Action Recognition Algorithm Based on Multi-Feature Map Fusion; Haofei Wang (0), https://orcid.org/0000-0002-3898-0927; Junfeng Li (1), https://orcid.org/0000-0002-1207-3317; Department of Control Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, China; Department of Control Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, China; https://ieeexplore.ieee.org/document/9169613/; Human action recognition; resnext-101 network; up-sampling method; weight fusion
collection DOAJ
language English
format Article
sources DOAJ
author Haofei Wang
Junfeng Li
spellingShingle Haofei Wang
Junfeng Li
Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
IEEE Access
Human action recognition
resnext-101 network
up-sampling method
weight fusion
author_facet Haofei Wang
Junfeng Li
author_sort Haofei Wang
title Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
title_short Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
title_full Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
title_fullStr Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
title_full_unstemmed Human Action Recognition Algorithm Based on Multi-Feature Map Fusion
title_sort human action recognition algorithm based on multi-feature map fusion
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description The emergence of the convolutional neural network has greatly improved the accuracy of human action recognition. However, as the network deepens, fewer and fewer features are extracted, and in some datasets the size of the target to be recognized varies with the shooting angle. To solve these problems, we build on the ResNeXt human action recognition method and propose an improved ResNeXt method based on multi-feature map fusion. First, the video is uniformly sampled to generate training samples, and we generate samples with different frames as the input to the network. Second, we add n up-sampling layers after layer 1 of ResNeXt to enlarge the feature maps and extract multiple feature maps, so that the extracted feature maps are clearer and small targets are recognized better. Finally, the n results obtained are fused with the weighted geometric mean combination forecasting method based on the L_1 norm to obtain the final result. In experiments on UCF-101 and HMDB-51, our model reaches an accuracy of 90.3% on UCF-101, which is higher than most state-of-the-art algorithms.
topic Human action recognition
resnext-101 network
up-sampling method
weight fusion
url https://ieeexplore.ieee.org/document/9169613/
work_keys_str_mv AT haofeiwang humanactionrecognitionalgorithmbasedonmultifeaturemapfusion
AT junfengli humanactionrecognitionalgorithmbasedonmultifeaturemapfusion
_version_ 1724181216448479232
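
Illustrative note on the description field: the abstract names two steps that are easy to picture in code, uniform frame sampling and fusion of n branch outputs by a weighted geometric mean. The Python sketch below is only a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, and the fusion weights are taken as given inputs, because the record does not say how the L_1-norm-based method determines them.

```python
import math


def uniform_sample_indices(num_frames, clip_len):
    """Pick clip_len frame indices spread evenly over num_frames frames.

    Hypothetical helper: the record only says the video is uniformly sampled.
    Short videos simply repeat indices.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    step = num_frames / clip_len
    # Take the centre of each of the clip_len equal-width segments.
    return [min(num_frames - 1, int(step * i + step / 2)) for i in range(clip_len)]


def weighted_geometric_fusion(branch_probs, weights):
    """Fuse per-branch class probabilities with a weighted geometric mean.

    branch_probs: one probability list per branch (n branches).
    weights: one non-negative weight per branch. Assumed to be given;
    the L_1-norm-based weight selection is not reproduced here.
    """
    if len(branch_probs) != len(weights):
        raise ValueError("need exactly one weight per branch")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm_w = [w / total for w in weights]
    eps = 1e-12  # guards against log(0)
    n_classes = len(branch_probs[0])
    # Weighted geometric mean computed in log space for numerical stability.
    log_fused = [
        sum(w * math.log(max(row[c], eps)) for w, row in zip(norm_w, branch_probs))
        for c in range(n_classes)
    ]
    peak = max(log_fused)
    unnorm = [math.exp(v - peak) for v in log_fused]
    z = sum(unnorm)
    # Renormalise so the fused scores sum to 1.
    return [u / z for u in unnorm]


if __name__ == "__main__":
    print(uniform_sample_indices(16, 4))
    print(weighted_geometric_fusion([[0.9, 0.1], [0.4, 0.6]], [1.0, 1.0]))
```

Working in log space turns the product of weighted powers into a weighted sum, which avoids underflow when many classes have small probabilities.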