Summary: | Offloading complex computation to cloud servers allows cloud-based intelligent service robots to carry out highly complex computing tasks such as video analysis. In practical behavior surveillance applications, the videos captured by intelligent service robots are continuous. Action extraction from continuous unconstrained video is an important prerequisite for action analysis tasks such as action classification and recognition, abnormal event detection, and crowd emotion sensing. This paper proposes a novel approach for action extraction in continuous unconstrained video, which has three parts: spatial location estimation, temporal action path searching, and spatial-temporal action compensation. Spatial location estimation uses both human appearance and motion cues to obtain frame-level bounding boxes. Then, with the spatial action proposal results as a prior, the search for temporal action paths is formulated as an optimal estimation problem that accounts for the missed detections and false alarms of the spatial location estimation. To solve the temporal action path searching problem, we propose a Markov chain Monte Carlo algorithm and illustrate its convergence property. Extensive experiments on the challenging UCF-Sports and UCF-101 data sets show the effectiveness of our approach, which achieves superior performance compared with state-of-the-art methods.
|