Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network
Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods, and the spatial-temporal features are not discriminative enough. Temporal convolution with one fixed kernel cannot obtain m...
Main Authors: | Wang Li, Xu Liu, Zheng Liu, Feixiang Du, Qiang Zou |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
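Subjects: | Skeleton-based action recognition; graph convolutional network; multi-scale; multi-stream; attention mechanism; improved spatial-temporal |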
Online Access: | https://ieeexplore.ieee.org/document/9159664/ |
id |
doaj-d474799fe089443295f6ba76bde5560e |
---|---|
record_format |
Article |
spelling |
doaj-d474799fe089443295f6ba76bde5560e 2021-03-30T04:52:25Z eng IEEE IEEE Access 2169-3536 2020-01-01 8 144529 144542 10.1109/ACCESS.2020.3014445 9159664 Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network Wang Li 0 https://orcid.org/0000-0002-9254-3082 Xu Liu 1 https://orcid.org/0000-0002-6491-750X Zheng Liu 2 https://orcid.org/0000-0003-3630-9950 Feixiang Du 3 https://orcid.org/0000-0002-2769-5418 Qiang Zou 4 https://orcid.org/0000-0003-0668-1006 School of Microelectronics, Tianjin University, Tianjin, China School of Microelectronics, Tianjin University, Tianjin, China School of Microelectronics, Tianjin University, Tianjin, China School of Microelectronics, Tianjin University, Tianjin, China School of Microelectronics, Tianjin University, Tianjin, China Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods, and the spatial-temporal features are not discriminative enough. Temporal convolution with one fixed kernel cannot obtain sufficiently discriminative temporal features for different actions. In addition, only a single-scale feature is used for classification, which ignores multilevel information. In this article, we propose a novel multi-scale and multi-stream improved graph convolutional network (MM-IGCN). In each spatial-temporal block of MM-IGCN, we employ an improved temporal convolution with multiple parallel kernels to enhance the temporal features. An improved GCN and an enhanced attention module are adopted in the block to strengthen spatial-temporal features. A multi-scale structure is introduced to action recognition for the first time to obtain multilevel information. The improved spatial-temporal blocks and multi-scale structure compose our single-stream model. Moreover, we adopt the bone cosine distance as a novel input feature. Five feature streams (joint, bone, joint motion, bone motion, and bone cosine distance) are each fed into our single-stream model, and together they compose our MM-IGCN. Experiments on two large datasets, NTU-RGB+D and NTU-RGB+D-120, show that our single-stream model achieves state-of-the-art performance and that our MM-IGCN is far superior to other models. https://ieeexplore.ieee.org/document/9159664/ Skeleton-based action recognition graph convolutional network multi-scale multi-stream attention mechanism improved spatial-temporal |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wang Li Xu Liu Zheng Liu Feixiang Du Qiang Zou |
spellingShingle |
Wang Li Xu Liu Zheng Liu Feixiang Du Qiang Zou Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network IEEE Access Skeleton-based action recognition graph convolutional network multi-scale multi-stream attention mechanism improved spatial-temporal |
author_facet |
Wang Li Xu Liu Zheng Liu Feixiang Du Qiang Zou |
author_sort |
Wang Li |
title |
Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network |
title_short |
Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network |
title_full |
Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network |
title_fullStr |
Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network |
title_full_unstemmed |
Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network |
title_sort |
skeleton-based action recognition using multi-scale and multi-stream improved graph convolutional network |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods, and the spatial-temporal features are not discriminative enough. Temporal convolution with one fixed kernel cannot obtain sufficiently discriminative temporal features for different actions. In addition, only a single-scale feature is used for classification, which ignores multilevel information. In this article, we propose a novel multi-scale and multi-stream improved graph convolutional network (MM-IGCN). In each spatial-temporal block of MM-IGCN, we employ an improved temporal convolution with multiple parallel kernels to enhance the temporal features. An improved GCN and an enhanced attention module are adopted in the block to strengthen spatial-temporal features. A multi-scale structure is introduced to action recognition for the first time to obtain multilevel information. The improved spatial-temporal blocks and multi-scale structure compose our single-stream model. Moreover, we adopt the bone cosine distance as a novel input feature. Five feature streams (joint, bone, joint motion, bone motion, and bone cosine distance) are each fed into our single-stream model, and together they compose our MM-IGCN. Experiments on two large datasets, NTU-RGB+D and NTU-RGB+D-120, show that our single-stream model achieves state-of-the-art performance and that our MM-IGCN is far superior to other models. |
topic |
Skeleton-based action recognition graph convolutional network multi-scale multi-stream attention mechanism improved spatial-temporal |
url |
https://ieeexplore.ieee.org/document/9159664/ |
work_keys_str_mv |
AT wangli skeletonbasedactionrecognitionusingmultiscaleandmultistreamimprovedgraphconvolutionalnetwork AT xuliu skeletonbasedactionrecognitionusingmultiscaleandmultistreamimprovedgraphconvolutionalnetwork AT zhengliu skeletonbasedactionrecognitionusingmultiscaleandmultistreamimprovedgraphconvolutionalnetwork AT feixiangdu skeletonbasedactionrecognitionusingmultiscaleandmultistreamimprovedgraphconvolutionalnetwork AT qiangzou skeletonbasedactionrecognitionusingmultiscaleandmultistreamimprovedgraphconvolutionalnetwork |
_version_ |
1724181153701691392 |
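
Illustrative note on the five input streams named in the description (joint, bone, their motions, and bone cosine distance). The record does not define how each stream is computed, so the following is only a minimal sketch under stated assumptions: bones are taken as child-joint minus parent-joint vectors, motions as frame-to-frame differences, and bone cosine distance as the standard `1 - cos` between pairs of bone vectors in a frame. The toy skeleton `BONES` and all function names are hypothetical and may differ from the paper's formulation.

```python
import math

# Hypothetical toy skeleton: 4 joints; each bone is (child, parent).
# The real NTU-RGB+D skeleton layout is not given in this record.
BONES = [(1, 0), (2, 1), (3, 1)]


def bone_stream(frame):
    """Bone vectors for one frame: child joint minus parent joint (assumed)."""
    return [tuple(c - p for c, p in zip(frame[child], frame[parent]))
            for child, parent in BONES]


def motion_stream(frames):
    """Frame-to-frame differences of per-frame vectors (assumed definition)."""
    return [[tuple(b - a for a, b in zip(v0, v1)) for v0, v1 in zip(f0, f1)]
            for f0, f1 in zip(frames, frames[1:])]


def cosine_distance(u, v, eps=1e-8):
    """Standard cosine distance 1 - cos(u, v); eps avoids division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm + eps)


def bone_cosine_stream(bones):
    """Pairwise cosine distances between all bones in one frame (assumed)."""
    return [[cosine_distance(u, v) for v in bones] for u in bones]


def five_streams(frames):
    """Derive the five streams from a list of frames of (x, y, z) joints."""
    bones = [bone_stream(f) for f in frames]
    return {
        "joint": frames,
        "bone": bones,
        "joint_motion": motion_stream(frames),
        "bone_motion": motion_stream(bones),
        "bone_cosine": [bone_cosine_stream(b) for b in bones],
    }


if __name__ == "__main__":
    frames = [
        [(0, 0, 0), (0, 1, 0), (1, 1, 0), (0, 2, 0)],
        [(0, 0, 0), (0, 1, 0), (1, 2, 0), (0, 2, 0)],
    ]
    for name, stream in five_streams(frames).items():
        print(name, stream)
```

In a multi-stream setup of the kind the description outlines, each of these streams would be fed to its own copy of the single-stream model; how the per-stream outputs are combined is not stated in this record.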