Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network

Bibliographic Details
Main Authors: Wang Li, Xu Liu, Zheng Liu, Feixiang Du, Qiang Zou
Format: Article
Language: English
Published: IEEE 2020-01-01
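Series: IEEE Access
Subjects: Skeleton-based action recognition; graph convolutional network; multi-scale; multi-stream; attention mechanism; improved spatial-temporal
Online Access: https://ieeexplore.ieee.org/document/9159664/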
id doaj-d474799fe089443295f6ba76bde5560e
record_format Article
spelling doaj-d474799fe089443295f6ba76bde5560e 2021-03-30T04:52:25Z
volume 8
pages 144529-144542
doi 10.1109/ACCESS.2020.3014445
orcid Wang Li https://orcid.org/0000-0002-9254-3082
Xu Liu https://orcid.org/0000-0002-6491-750X
Zheng Liu https://orcid.org/0000-0003-3630-9950
Feixiang Du https://orcid.org/0000-0002-2769-5418
Qiang Zou https://orcid.org/0000-0003-0668-1006
affiliation School of Microelectronics, Tianjin University, Tianjin, China (all five authors)
collection DOAJ
language English
format Article
sources DOAJ
author Wang Li
Xu Liu
Zheng Liu
Feixiang Du
Qiang Zou
title Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods, and their spatial-temporal features are not discriminative enough. A temporal convolution with a single fixed kernel cannot extract sufficiently discriminative temporal features for different actions. In addition, only a single-scale feature is used for classification, which ignores multilevel information. In this article, we propose a novel multi-scale and multi-stream improved graph convolutional network (MM-IGCN). In each spatial-temporal block of MM-IGCN, we employ an improved temporal convolution with multiple parallel kernels to enhance the temporal features. An improved GCN and an enhanced attention module are adopted in the block to strengthen the spatial-temporal features. A multi-scale structure is introduced to action recognition for the first time to capture multilevel information. The improved spatial-temporal blocks and the multi-scale structure together form our single-stream model. Moreover, we adopt the bone cosine distance as a novel input feature. Five feature streams (joint, bone, their motions, and bone cosine distance) are each fed into a separate copy of the single-stream model, and together these form our MM-IGCN. Experiments on two large datasets, NTU-RGB+D and NTU-RGB+D-120, show that our single-stream model achieves state-of-the-art performance and that MM-IGCN substantially outperforms other models.
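The five input streams named in the description (joint, bone, their motions, and bone cosine distance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the 5-joint toy skeleton in PARENTS is hypothetical, motion is assumed to be a frame-to-frame difference, and the bone cosine distance is assumed to be 1 minus the cosine similarity between pairs of bone vectors in a frame, since the abstract does not define it.

```python
import math

# Hypothetical 5-joint toy skeleton: PARENTS[j] is the joint that joint j
# attaches to (the root points at itself). This is NOT the NTU-RGB+D layout.
PARENTS = [0, 0, 1, 2, 1]


def sub(a, b):
    """Element-wise difference of two coordinate lists."""
    return [x - y for x, y in zip(a, b)]


def bones_of(frame, parents=PARENTS):
    """Bone vector per joint: joint position minus its parent's position."""
    return [sub(frame[j], frame[parents[j]]) for j in range(len(frame))]


def motion_of(frames):
    """Frame-to-frame difference; the last frame is zero-padded to keep length T."""
    diffs = [[sub(b, a) for a, b in zip(f0, f1)] for f0, f1 in zip(frames, frames[1:])]
    zeros = [[0.0] * len(v) for v in frames[-1]]
    return diffs + [zeros]


def cosine_distance(u, v, eps=1e-8):
    """ASSUMED definition: 1 - cosine similarity; zero vectors count as orthogonal."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu < eps or nv < eps:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(u, v)) / (nu * nv)


def build_streams(joints):
    """joints: list of T frames, each a list of V joints, each a list of C coords."""
    bones = [bones_of(frame) for frame in joints]
    # Pairwise cosine distance between all bones of the same frame (V x V per frame).
    bone_cos = [[[cosine_distance(u, v) for v in fb] for u in fb] for fb in bones]
    return {
        "joint": joints,
        "bone": bones,
        "joint_motion": motion_of(joints),
        "bone_motion": motion_of(bones),
        "bone_cosine": bone_cos,
    }


# Two frames of a 2-D toy skeleton (made-up coordinates, for illustration only).
frames = [
    [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [0.0, 2.0]],
    [[0.0, 0.0], [0.0, 1.0], [1.0, 2.0], [2.0, 2.0], [0.0, 2.0]],
]
streams = build_streams(frames)
```

In a multi-stream setup of the kind the description outlines, each entry of `streams` would be passed to its own copy of the single-stream model, and the per-stream predictions would then be combined; how they are combined is not stated in the abstract.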
topic Skeleton-based action recognition
graph convolutional network
multi-scale
multi-stream
attention mechanism
improved spatial-temporal
url https://ieeexplore.ieee.org/document/9159664/