Temporal Learning in Video Data Using Deep Learning and Gaussian Processes

Bibliographic Details
Main Authors: Devesh K. Jha, Abhishek Srivastav, Asok Ray
Format: Article
Language: English
Published: The Prognostics and Health Management Society 2016-12-01
Series: International Journal of Prognostics and Health Management
Subjects:
Online Access: https://papers.phmsociety.org/index.php/ijphm/article/view/2460
Description
Summary: This paper presents an approach for data-driven modeling of hidden, stationary temporal dynamics in sequential images or videos using deep learning and Bayesian non-parametric techniques. In particular, a deep Convolutional Neural Network (CNN) is used to extract spatial features from individual images in an unsupervised fashion, and a Gaussian process is then used to model the temporal dynamics of the spatial features extracted by the deep CNN. By decomposing the spatial and temporal components and utilizing the strengths of deep learning and Gaussian processes for the respective sub-problems, we are able to construct a model that captures complex spatio-temporal phenomena while using a relatively small number of free parameters. The proposed approach is tested on high-speed grey-scale video of combustion flames in a swirl-stabilized combustor, where certain protocols are used to induce instability in the combustion process. The proposed approach is then used to detect and predict the transition of the combustion process from a stable to an unstable regime. It is demonstrated that the proposed approach is able to detect unstable flame conditions using very few frames from high-speed video. This is useful because early detection of unstable combustion can lead to better control strategies to mitigate instability. Results from the proposed approach are compared and contrasted with several baselines and recent work in this area. The performance of the proposed approach is found to be significantly better in terms of detection accuracy, model complexity, and lead time to detection.
ISSN: 2153-2648
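
The second stage described in the summary, a Gaussian process over a time series of spatial features, can be illustrated with a minimal sketch. This is not the authors' implementation: the squared-exponential kernel, the hyperparameter values, and the synthetic `feature` signal (a hypothetical stand-in for one CNN feature tracked across frames) are all assumptions made only for illustration.

```python
import numpy as np


def rbf_kernel(t1, t2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance; stationary because it depends only on t1 - t2."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(t_train, y_train, t_test, noise=1e-2, **kernel_args):
    """Posterior mean and pointwise variance of a zero-mean GP at t_test."""
    K = rbf_kernel(t_train, t_train, **kernel_args) + noise * np.eye(len(t_train))
    K_s = rbf_kernel(t_train, t_test, **kernel_args)
    K_ss = rbf_kernel(t_test, t_test, **kernel_args)
    # A Cholesky factorization is used instead of an explicit matrix inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var


# Hypothetical data: one scalar feature observed over 50 frames (assumed, not from the paper).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
feature = np.sin(2 * np.pi * 3 * t) + 0.05 * rng.standard_normal(t.size)

# Query the posterior on a finer time grid.
t_new = np.linspace(0.0, 1.0, 200)
mean, var = gp_posterior(t, feature, t_new, noise=0.05 ** 2, length_scale=0.1)
```

In a sketch like this the only free parameters of the temporal model are the kernel length scale, the kernel variance, and the noise level, which is in the spirit of the summary's remark about a relatively small number of free parameters.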