Summary: Master's thesis, National Tsing Hua University, Institute of Information Systems and Applications, academic year 103 (2014). In this thesis, we propose a novel 3D hand skeleton estimation system that estimates the positions of hand joints from a single depth image. Hand skeleton estimation can be widely used in the fields of Human-Computer Interaction (HCI) and gesture recognition, and numerous depth-sensor studies have targeted applications in these domains. However, uniform skin color, self-occlusion, viewpoint variation, and the hand's high number of degrees of freedom make 3D hand skeleton estimation and gesture recognition from color or RGB-D images difficult. Both model-based and discriminative methods have been proposed to address these problems. In this work, we combine vision-based learning with the Active Shape Model (ASM) approach to estimate the 3D hand skeleton from a single depth image.
The proposed approach consists of two principal steps: the first selects the ASM model corresponding to the input depth image, and the second uses the skeleton-based ASM to iteratively refine the joint positions. In the training phase, we cluster a dataset of annotated hand depth images with the K-means algorithm and build a random forest over the clusters, so that the forest can output the probability of each cluster for an input data point. Meanwhile, a PCA skeleton model and joint profile models are constructed for each cluster. Thus, for an input hand depth image, the system first determines the associated ASM model from the random forest and then estimates the 3D hand skeleton with a modified ASM fitting process. Quantitative evaluations in our experiments demonstrate that the proposed algorithm estimates 3D hand skeletons effectively.
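A minimal sketch of the training pipeline, using scikit-learn, is given below. It assumes each annotated depth image has already been reduced to a fixed-length feature vector and that clustering is performed on the annotated skeletons; the feature extraction, the names, and the parameter choices (N_CLUSTERS, n_components, forest size) are illustrative assumptions, not details taken from the thesis.

```python
# Hypothetical sketch of the training phase: K-means clustering of annotated
# skeletons, a random forest mapping depth features to cluster probabilities,
# and one PCA skeleton model per cluster. All names and values are assumed.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA

N_CLUSTERS = 10  # assumed number of hand-pose clusters

def train(X, Y):
    """X: (n_samples, n_features) depth features; Y: (n_samples, 3*n_joints) skeletons."""
    # 1. Cluster the annotated hand skeletons with K-means.
    kmeans = KMeans(n_clusters=N_CLUSTERS, n_init=10).fit(Y)
    labels = kmeans.labels_

    # 2. Train a random forest to predict cluster membership from depth features,
    #    so predict_proba later yields the probability of each cluster.
    forest = RandomForestClassifier(n_estimators=100).fit(X, labels)

    # 3. Build one PCA skeleton model per cluster (the per-cluster shape model).
    pca_models = [PCA(n_components=5).fit(Y[labels == c]) for c in range(N_CLUSTERS)]
    return kmeans, forest, pca_models
```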
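The estimation stage can be sketched in the same spirit: the forest selects the most probable cluster, and classic ASM-style iterations alternate between an image-driven joint update and a projection onto that cluster's PCA shape space. The profile_match callback below is a stand-in for the thesis's joint profile models, and the three-standard-deviation clamp is the standard ASM shape constraint; both are assumptions rather than the thesis's exact modified fitting process.

```python
# Hypothetical sketch of the fitting phase: cluster selection via the random
# forest, then iterative ASM-style refinement of the 3D joint positions.
import numpy as np

def fit_skeleton(x_feat, depth_img, forest, pca_models, profile_match, n_iters=20):
    # 1. Pick the most probable cluster for this depth image.
    probs = forest.predict_proba(x_feat[None, :])[0]
    pca = pca_models[int(np.argmax(probs))]

    # 2. Initialize with the cluster's mean skeleton.
    skel = pca.mean_.copy()
    for _ in range(n_iters):
        # 2a. Move each joint toward its best local match in the depth image
        #     (profile_match is a placeholder for the joint profile models).
        target = profile_match(depth_img, skel)
        # 2b. Project onto the learned PCA shape subspace and clamp each
        #     coefficient to +/- 3 standard deviations, as in standard ASM.
        b = pca.transform(target[None, :])
        limit = 3.0 * np.sqrt(pca.explained_variance_)
        b = np.clip(b, -limit, limit)
        skel = pca.inverse_transform(b)[0]
    return skel.reshape(-1, 3)  # (n_joints, 3) joint positions
```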