DeepPear: Deep Pose Estimation and Action Recognition

Bibliographic Details
Main Authors: Jhuang, You-Ying, 莊侑穎
Other Authors: Tsai, Wen-Jiin
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/52d4yz
Description
Summary: Master's thesis === National Chiao Tung University === Institute of Multimedia Engineering === 107 === Over the last few years, human action recognition has been a popular research topic because it applies to many domains, such as intelligent surveillance systems, autonomous vehicle control, and robotics. Human action recognition from RGB video is difficult because cluttered backgrounds can interfere with the learning of actions. In contrast to most video-based action recognition approaches, which use 3D convolutional neural networks, the proposed method first estimates the 3D human pose, which helps remove the cluttered background and focus on the human body. This keeps action learning from overfitting to the cluttered background. Besides human pose, the proposed method also uses the RGB features near the predicted human joints to make action prediction context-aware. After human pose estimation and RGB feature extraction, the proposed method uses a two-stream architecture to perform action recognition. Experimental results show that the proposed method outperforms many state-of-the-art methods on NTU RGB+D, a large-scale human action recognition dataset.
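
The summary names two ideas that a small example may make more concrete: taking RGB content near the predicted joints, and combining a pose stream with an RGB stream. The following is only an illustrative sketch under stated assumptions, not the thesis's implementation. The record does not say how patches are sized or how the two streams are fused, so the square clamped patch, the weighted average of softmax probabilities, and the names `crop_patch`, `fuse_streams`, and `pose_weight` are all hypothetical.

```python
import math

def crop_patch(frame, cx, cy, half):
    """Return the square patch of side 2*half+1 centred on (cx, cy).

    Coordinates outside the frame are clamped to the nearest edge pixel.
    Patch shape and clamping are assumptions made for this sketch.
    """
    h, w = len(frame), len(frame[0])
    rows = range(cy - half, cy + half + 1)
    cols = range(cx - half, cx + half + 1)
    return [[frame[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)] for c in cols]
            for r in rows]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_streams(pose_scores, rgb_scores, pose_weight=0.5):
    """Late fusion by weighted averaging of per-stream class probabilities.

    This fusion rule is an assumption; the summary only states that a
    two-stream architecture is used.
    """
    p = softmax(pose_scores)
    q = softmax(rgb_scores)
    return [pose_weight * a + (1 - pose_weight) * b for a, b in zip(p, q)]

# Toy grayscale frame (4 rows x 6 columns) and two 2D joint locations (x, y).
frame = [[10 * r + c for c in range(6)] for r in range(4)]
joints = [(0, 0), (3, 2)]
patches = [crop_patch(frame, x, y, half=1) for x, y in joints]

# Made-up class scores from a pose stream and an RGB stream (3 classes).
fused = fuse_streams([2.0, 0.5, 0.1], [1.0, 1.5, 0.2])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this toy setup the patches stand in for the context-aware RGB features around each joint, and `fused` stands in for the final class distribution produced by combining both streams.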