Summary: | Master's === National Taiwan University === Graduate Institute of Communication Engineering === 100 === To recognize temporally extended actions, it is useful to introduce high-order temporal dependence into the recognition task. However, doing so greatly increases computational complexity when commonly used graphical models such as HMM and CRF are employed. In this thesis, multivariate linear prediction is proposed to exploit high-order temporal dependence at lower computational complexity. In addition, our method requires no effort to define and manually label states, and it can improve bag-of-words representations, which may contain considerable noise but have shown excellent performance in previous work. To show the applicability of the proposed method, we experiment not only on video datasets, including KTH and UCF, but also on skeleton datasets such as MSR 3D action and UCF Kinect. On most of them, our method outperforms the state-of-the-art methods.
|