3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions

In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks used for action recognition are much shallower than the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Ne...

Bibliographic Details
Main Author: Gu, Dongfeng
Other Authors: Laganière, Robert
Language: en
Published: Université d'Ottawa / University of Ottawa 2017
Subjects:
CNN
Online Access: http://hdl.handle.net/10393/36739
http://dx.doi.org/10.20381/ruor-21013
id ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-36739
record_format oai_dc
spelling ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-36739 2018-01-05T19:03:09Z 3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions Gu, Dongfeng Laganière, Robert Petriu, Emil 3D-DenseNet CNN 2017-10-03T16:57:03Z 2017-10-03T16:57:03Z 2017 Thesis http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013 en Université d'Ottawa / University of Ottawa
collection NDLTD
language en
sources NDLTD
topic 3D-DenseNet
CNN
description In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks used for action recognition are much shallower than the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network extends Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected to each other in a feed-forward fashion. At each layer, the feature maps of all preceding layers are concatenated along the last dimension and used as input, and the layer's own feature maps are passed on as input to all subsequent layers. We propose two versions of 3D-DenseNet: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of parameters to train at the beginning of the network, which allows a deeper network. We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experimental results show that our method outperforms the state-of-the-art action recognition method on the MERL shopping dataset and achieves competitive results on the KTH dataset.
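The dense connectivity described in the abstract can be illustrated with simple shape bookkeeping. The sketch below is not the thesis implementation: the function names are hypothetical, and the growth rate, layer count, clip size, and pooling window are illustrative assumptions rather than settings reported by the author. It only shows how concatenation along the last (channel) dimension widens the input of each successive layer, and how an early 3D pooling layer, as in the lite variant, shrinks the feature-map volume passed to later layers.

```python
# Shape bookkeeping for dense connectivity over 3D (frames, height, width) feature maps.
# All numeric values below are illustrative assumptions, not values from the thesis.

def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return (input channels seen by each layer, channels leaving the block).

    Each layer takes the concatenation of the block input and the outputs of
    all preceding layers, and adds `growth_rate` new feature maps of its own.
    """
    inputs_per_layer = []
    channels = in_channels
    for _ in range(num_layers):
        inputs_per_layer.append(channels)
        channels += growth_rate  # this layer's output is concatenated on the channel axis
    return inputs_per_layer, channels


def pool3d_shape(shape, window=2, stride=2):
    """Output (frames, height, width) of an unpadded 3D pooling layer."""
    return tuple((size - window) // stride + 1 for size in shape)


if __name__ == "__main__":
    per_layer, out_channels = dense_block_channels(in_channels=16, growth_rate=12, num_layers=4)
    print("input channels per layer:", per_layer)
    print("channels after the block:", out_channels)
    # Hypothetical 16-frame, 64x64 clip pooled once right after the first convolution.
    print("volume after 3D pooling:", pool3d_shape((16, 64, 64)))
```

Because every layer adds only a fixed number of feature maps, the channel count grows linearly with depth, which is one reason densely connected networks can be made deep at moderate cost.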
author2 Laganière, Robert
author_facet Laganière, Robert
Gu, Dongfeng
author Gu, Dongfeng
author_sort Gu, Dongfeng
title 3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions
publisher Université d'Ottawa / University of Ottawa
publishDate 2017
url http://hdl.handle.net/10393/36739
http://dx.doi.org/10.20381/ruor-21013