Learning a Deep Network with Spherical Part Model for 3D Hand Pose Estimation


Bibliographic Details
Main Authors: Tzu-Yang Chen, 陳子揚
Other Authors: 傅立成
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/94958919666483752198
Description
Summary: Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 104 === 3D hand pose estimation, the task of locating the joints of a hand, has been an active research topic in recent years. It is widely used in virtual reality and human-computer interaction applications because it provides a natural interface between humans and cyberspace and improves the user experience. Despite rapid progress in the field, 3D hand pose estimation remains difficult for two main reasons. First, the hand must be detected in a complex environment, and its appearance varies with viewpoint, distance, and subject. Second, the hand's high degrees of freedom and severe self-occlusion make pose estimation hard. The rise of depth sensors and deep learning, however, makes this challenging task feasible. In this thesis, we aim to build a 3D hand pose estimation system that correctly detects the human hand and accurately estimates its pose from depth images. To make the system robust, we design a hand model called the spherical part model (SPM) and train a deep convolutional neural network using this model. Moreover, to reduce the influence of human omissions, we use a data-driven approach to integrate the two. As a result, our network estimates hand pose more accurately by drawing on prior knowledge of the human hand. To demonstrate the superiority of our method, we conduct experiments on two public datasets and one self-built dataset. The results show that our system detects human hands with almost 90% average precision and achieves an average error distance of about 10 millimeters on pose estimation, substantially outperforming other state-of-the-art methods.
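
The summary reports pose accuracy as an average error distance in millimeters but does not define the metric. A minimal sketch of one common reading follows, assuming the metric is the mean Euclidean distance between each predicted 3D joint and its ground-truth position. The function name `mean_joint_error_mm` and the sample coordinates are hypothetical and are not taken from the thesis.

```python
import math


def mean_joint_error_mm(predicted, ground_truth):
    """Average Euclidean distance between matching 3D joints.

    Both arguments are sequences of (x, y, z) tuples in millimeters,
    listed in the same joint order. This definition is an assumption;
    the record does not state how the thesis computes its error.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("joint lists must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical joint coordinates in millimeters (not data from the thesis).
predicted = [(0.0, 0.0, 300.0), (10.0, 0.0, 300.0), (20.0, 5.0, 310.0)]
ground_truth = [(3.0, 4.0, 300.0), (10.0, 0.0, 312.0), (20.0, 5.0, 310.0)]

# Per-joint distances are 5, 12, and 0 mm, so the mean is 17 / 3 mm.
error = mean_joint_error_mm(predicted, ground_truth)
print(f"mean joint error: {error:.2f} mm")
```

Averaging this value over every frame of a test set would give a single figure comparable in kind to the roughly 10 millimeters quoted above, provided the thesis uses the same definition.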