Human action recognition using deep probabilistic graphical models

Building intelligent systems that are capable of representing or extracting high-level representations from high-dimensional sensory data lies at the core of solving many A.I. related tasks. Human action recognition is an important topic in computer vision that lies in high-dimensional space. Its ap...

Bibliographic Details
Main Author: Wu, Di
Other Authors: Shao, Ling
Published: University of Sheffield 2014
Subjects:
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.617199
id ndltd-bl.uk-oai-ethos.bl.uk-617199
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-617199; 2017-10-04T03:25:00Z; Human action recognition using deep probabilistic graphical models; Wu, Di; Shao, Ling; 2014; 621.3; University of Sheffield; http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.617199; http://etheses.whiterose.ac.uk/6603/; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 621.3
spellingShingle 621.3
Wu, Di
Human action recognition using deep probabilistic graphical models
description Building intelligent systems that are capable of representing or extracting high-level representations from high-dimensional sensory data lies at the core of solving many AI-related tasks. Human action recognition is an important topic in computer vision that lies in high-dimensional space. Its applications include robotics, video surveillance, human-computer interaction, user interface design, and multimedia video retrieval, amongst others. A number of approaches have been proposed to extract representative features from high-dimensional temporal data, most commonly hard-wired geometric or bio-inspired shape context features. This thesis first demonstrates some ad hoc hand-crafted rules for effectively encoding motion features, and later presents a more generic approach for incorporating structured feature learning and reasoning, i.e., deep probabilistic graphical models. The hierarchical dynamic framework first extracts high-level features and then uses the learned representation to estimate emission probabilities for inferring action sequences. We show that better action recognition can be achieved by replacing Gaussian mixture models with Deep Neural Networks that contain many layers of features to predict probability distributions over the states of Markov models. The framework can be easily extended to include an ergodic state to segment and recognise actions simultaneously. The first part of the thesis focuses on the analysis and application of hand-crafted features for human action representation and classification. We show that the "hard-coded" concept of the correlogram can incorporate correlations between time-domain sequences, and we further investigate multi-modal inputs, e.g., depth sensor input and its unique traits for action recognition. The second part of this thesis focuses on marrying probabilistic graphical models with Deep Neural Networks (both Deep Belief Networks and Deep 3D Convolutional Neural Networks) for structured sequence prediction. The proposed Deep Dynamic Neural Network provides a general framework for structured 2D data representation and classification. This motivates further investigation into applying various graphical models to time-variant video sequences.
author2 Shao, Ling
author_facet Shao, Ling
Wu, Di
author Wu, Di
author_sort Wu, Di
title Human action recognition using deep probabilistic graphical models
title_short Human action recognition using deep probabilistic graphical models
title_full Human action recognition using deep probabilistic graphical models
title_fullStr Human action recognition using deep probabilistic graphical models
title_full_unstemmed Human action recognition using deep probabilistic graphical models
title_sort human action recognition using deep probabilistic graphical models
publisher University of Sheffield
publishDate 2014
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.617199
work_keys_str_mv AT wudi humanactionrecognitionusingdeepprobabilisticgraphicalmodels
_version_ 1718543660283330560
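
Illustrative note: the description above says that a Deep Neural Network replaces Gaussian mixture models for estimating the emission probabilities of a Markov model. The record gives no implementation details, so the following is only a minimal sketch of one common way such a hybrid can be decoded, not the thesis's own code. It assumes the network outputs per-frame state posteriors, converts them to scaled likelihoods by dividing by state priors (a standard hybrid convention, assumed here), and runs Viterbi decoding. The function name `viterbi_hybrid` and all parameter names are hypothetical.

```python
import math


def viterbi_hybrid(posteriors, state_priors, log_trans, log_init):
    """Return the most likely state path for a hybrid network-HMM model.

    posteriors[t][s] : assumed network output p(state s | frame t)
    state_priors[s]  : assumed prior p(state s), e.g. from training alignments
    log_trans[i][j]  : log transition probability from state i to state j
    log_init[s]      : log initial probability of state s
    """
    n_states = len(state_priors)
    floor = 1e-12  # avoids log(0) for posteriors that are exactly zero

    # Scaled log-likelihood: log p(x_t | s) = log p(s | x_t) - log p(s) + const
    log_emit = [
        [math.log(max(frame[s], floor)) - math.log(state_priors[s])
         for s in range(n_states)]
        for frame in posteriors
    ]

    # Initialise with the first frame, then extend one frame at a time.
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(posteriors)):
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j]
                             + log_emit[t][j])
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch, each decoded state index would map to an action (or a sub-state of an action), so a run of identical states corresponds to one segment of the input sequence. How the thesis defines its states, priors, and the ergodic state is not specified in this record.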