Summary: | Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 103 === Vision-based mid-air handwriting recognition has many applications. For example, it can be used to control Apple’s iTV or serve as a text editor in motion-sensing games. It is also a challenging task. A pen tablet can use contact with the panel to decide when to start recording a trajectory and when to output a result; a mid-air handwriting recognition system has no such contact, so this paper has to design both events explicitly. In addition, the same word written by different users varies in scale and style.
This paper obtains depth information from Kinect and uses OpenNI to extract the human skeleton. Users do not need to perform any gesture to start writing. When they finish writing, they simply hold their hand at the end point of the word for one second to start the recognition process. This design inevitably produces a redundant trajectory. Because the redundant trajectory usually lies at the beginning of the entire trajectory, we use a backward segment-combining architecture to remove it. First, we detect turning points to segment the trajectory. Then we combine the segments backward, from the last to the first. Every combined segment is normalized to a uniform scale. Finally, we extract turning-point, Haar wavelet, dynamic time-ordered shape context, and global time-ordered shape context features, and use dynamic time warping to match these features against the database. Because some partially combined segments may resemble certain classes during the combining process, the system may return a wrong result. In general, the more segments the system combines, the more closely the combined segment tends to resemble the intended word. This paper exploits this property by applying a series of incremental weights. It also uses hierarchical classification to match against the database more effectively, progressively finding the optimal combined segment and class.
|
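The backward segment-combining idea in the summary can be illustrated with a minimal sketch. It is not the thesis implementation: it matches raw 2D coordinates with dynamic time warping only, and it omits the Haar wavelet and shape-context features as well as the hierarchical classification. The function names, the 45-degree turning threshold, the `weight_step` parameter, and the scoring rule `weight / (1 + distance)` are all assumptions made for illustration.

```python
import math

def turning_points(points, angle_deg=45.0):
    """Indices where the writing direction changes by more than angle_deg (assumed threshold)."""
    idx = [0]
    for i in range(1, len(points) - 1):
        ax, ay = points[i][0] - points[i - 1][0], points[i][1] - points[i - 1][1]
        bx, by = points[i + 1][0] - points[i][0], points[i + 1][1] - points[i][1]
        na, nb = math.hypot(ax, ay), math.hypot(bx, by)
        if na == 0 or nb == 0:
            continue  # skip repeated points
        cos = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
        if math.degrees(math.acos(cos)) > angle_deg:
            idx.append(i)
    idx.append(len(points) - 1)
    return idx

def normalize(points):
    """Shift to the origin and scale the longer bounding-box side to 1."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]

def dtw(a, b):
    """Dynamic time warping distance between two point sequences, divided by len(a) + len(b)."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)] / (len(a) + len(b))

def recognize(points, templates, weight_step=0.1):
    """Combine segments from last to first; return (label, score, segments_used)."""
    idx = turning_points(points)
    segments = [points[s:e + 1] for s, e in zip(idx, idx[1:])]
    best = (None, -1.0, 0)
    combined = []
    for k, seg in enumerate(reversed(segments), start=1):
        # Adjacent segments share one endpoint, so drop the duplicate when joining.
        combined = list(seg) + combined[1:] if combined else list(seg)
        candidate = normalize(combined)
        weight = 1.0 + weight_step * (k - 1)  # assumed incremental weight
        for label, template in templates.items():
            score = weight / (1.0 + dtw(candidate, normalize(template)))
            if score > best[1]:
                best = (label, score, k)
    return best

# Hypothetical data: a diagonal lead-in stroke followed by the letter "L".
trajectory = [(-2, 0), (-1, 1), (0, 2), (0, 1), (0, 0), (1, 0)]
templates = {"L": [(0, 2), (0, 1), (0, 0), (1, 0)], "-": [(0, 0), (2, 0)]}
label, score, used = recognize(trajectory, templates)
print(label, used)  # L 2
```

In this toy example the final stroke alone matches the "-" template perfectly, which is the kind of partial match the summary warns about. With `weight_step=0.0` the sketch returns "-", while the incremental weight lets the two-segment combination win as "L" and leaves the lead-in stroke out.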