3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions

In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks used for action recognition are much shallower than the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Ne...

Bibliographic Details
Main Author:	Gu, Dongfeng
Other Authors:	Laganière, Robert
Language:	en
Published:	Université d'Ottawa / University of Ottawa 2017
Subjects:	3D-DenseNet CNN
Online Access:	http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013

id	ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-36739
record_format	oai_dc
spelling	ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-36739 2018-01-05T19:03:09Z 3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions Gu, Dongfeng Laganière, Robert Petriu, Emil 3D-DenseNet CNN 2017-10-03T16:57:03Z 2017-10-03T16:57:03Z 2017 Thesis http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013 en Université d'Ottawa / University of Ottawa
collection	NDLTD
language	en
sources	NDLTD
topic	3D-DenseNet CNN
spellingShingle	3D-DenseNet CNN Gu, Dongfeng 3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions
description	In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks used for action recognition are much shallower than the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network extends Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected to each other in a feed-forward fashion. At each layer, the feature maps of all preceding layers are concatenated along the last dimension and used as input, and the layer's own feature maps are passed on to all subsequent layers. We propose two versions of 3D-DenseNet: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of training parameters at the beginning, so that we can build a deeper network. We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experimental results demonstrate that our method outperforms the state-of-the-art action recognition method on the MERL shopping dataset and achieves a competitive result on the KTH dataset.
author2	Laganière, Robert
author_facet	Laganière, Robert Gu, Dongfeng
author	Gu, Dongfeng
author_sort	Gu, Dongfeng
title	3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions
publisher	Université d'Ottawa / University of Ottawa
publishDate	2017
url	http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013
work_keys_str_mv	AT gudongfeng 3ddenselyconnectedconvolutionalnetworkfortherecognitionofhumanshoppingactions
_version_	1718598960281550848
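
The dense connectivity summarized in the description can be illustrated with a little bookkeeping. The sketch below is not taken from the thesis: the function names, the initial channel count, the growth rate, and the clip size are all hypothetical values chosen for illustration. It shows how the number of input feature maps grows inside a dense block when every layer receives the concatenation of all earlier outputs, and how an unpadded 3D pooling layer, such as the one the lite variant places after the first 3D convolution, shrinks a (frames, height, width) volume.

```python
# Illustrative sketch only: all names and numeric values are assumptions,
# not figures from the thesis.

def dense_block_input_channels(initial_channels, growth_rate, num_layers):
    """Return the number of input channels seen by each layer of a dense block.

    Every layer emits `growth_rate` new feature maps and receives the
    concatenation of the block input and all earlier layers' outputs.
    """
    return [initial_channels + i * growth_rate for i in range(num_layers)]


def pooled_shape(shape, window, stride):
    """Shape of a (frames, height, width) volume after unpadded 3D pooling."""
    return tuple((size - w) // s + 1 for size, w, s in zip(shape, window, stride))


if __name__ == "__main__":
    # Hypothetical block: 16 input channels, growth rate 12, 4 layers.
    print(dense_block_input_channels(16, 12, 4))  # [16, 28, 40, 52]
    # Hypothetical clip of 16 frames at 64x64, pooled with a 2x2x2 window.
    print(pooled_shape((16, 64, 64), (2, 2, 2), (2, 2, 2)))  # (8, 32, 32)
```

Because the input width of each layer grows linearly with depth, shrinking the spatio-temporal volume early keeps the concatenated feature maps of later layers smaller, which is one plausible reading of why the lite variant can be made deeper.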
