3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions
In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks for action recognition are not as deep as the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Ne...
Main Author: | Gu, Dongfeng |
---|---|
Other Authors: | Laganière, Robert |
Language: | en |
Published: | Université d'Ottawa / University of Ottawa, 2017 |
Subjects: | 3D-DenseNet, CNN |
Online Access: | http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013 |
id |
ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-36739 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-36739 2018-01-05T19:03:09Z 3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions Gu, Dongfeng Laganière, Robert Petriu, Emil 3D-DenseNet CNN In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks for action recognition are not as deep as the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network extends Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected with each other in a feed-forward fashion. In each layer, the feature maps of all preceding layers are concatenated along the last dimension and are used as inputs to all subsequent layers. We propose two different versions of 3D-DenseNet: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of trainable parameters at the beginning so that we can reach a deeper network. We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experimental results demonstrate that our method outperforms the state-of-the-art action recognition method on the MERL shopping dataset and achieves a competitive result on the KTH dataset. 2017-10-03T16:57:03Z 2017-10-03T16:57:03Z 2017 Thesis http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013 en Université d'Ottawa / University of Ottawa |
collection |
NDLTD |
language |
en |
sources |
NDLTD |
topic |
3D-DenseNet CNN |
spellingShingle |
3D-DenseNet CNN Gu, Dongfeng 3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions |
description |
In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most neural networks for action recognition are not as deep as the CNNs used in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network extends Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected with each other in a feed-forward fashion. In each layer, the feature maps of all preceding layers are concatenated along the last dimension and are used as inputs to all subsequent layers. We propose two different versions of 3D-DenseNet: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of trainable parameters at the beginning so that we can reach a deeper network.
We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experimental results demonstrate that our method outperforms the state-of-the-art action recognition method on the MERL shopping dataset and achieves a competitive result on the KTH dataset. |
author2 |
Laganière, Robert |
author_facet |
Laganière, Robert Gu, Dongfeng |
author |
Gu, Dongfeng |
author_sort |
Gu, Dongfeng |
title |
3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions |
title_short |
3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions |
title_full |
3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions |
title_fullStr |
3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions |
title_full_unstemmed |
3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions |
title_sort |
3d densely connected convolutional network for the recognition of human shopping actions |
publisher |
Université d'Ottawa / University of Ottawa |
publishDate |
2017 |
url |
http://hdl.handle.net/10393/36739 http://dx.doi.org/10.20381/ruor-21013 |
work_keys_str_mv |
AT gudongfeng 3ddenselyconnectedconvolutionalnetworkfortherecognitionofhumanshoppingactions |
_version_ |
1718598960281550848 |
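
The dense connectivity described in the abstract can be illustrated with some simple shape bookkeeping. The sketch below is not taken from the thesis: the function names, the growth rate, the clip size, and the pooling kernel are all hypothetical values chosen for illustration. It only shows how the number of input channels grows when each layer receives the concatenated feature maps of all preceding layers, and how a 3D pooling layer placed right after the first convolution (as in the lite variant) shrinks the temporal and spatial extent of what later layers process.

```python
# Illustrative shape bookkeeping only. Every numeric setting below is an
# assumption made for this sketch, not a value reported in the thesis.


def dense_block_channels(in_channels, num_layers, growth_rate):
    """Return (per-layer input channel counts, block output channel count).

    Each layer takes the concatenation of all preceding feature maps as input
    and contributes `growth_rate` new feature maps to that concatenation.
    """
    layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        layer_inputs.append(channels)  # concatenated maps seen by this layer
        channels += growth_rate        # this layer's maps join the stack
    return layer_inputs, channels


def pool3d_shape(shape, kernel=(2, 2, 2)):
    """Output (frames, height, width) of a 3D pooling layer.

    Assumes stride equal to the kernel size and no padding.
    """
    return tuple(size // k for size, k in zip(shape, kernel))


# Hypothetical input clip: 16 frames of 64x64 pixels.
clip = (16, 64, 64)

# Lite variant (as sketched here): pool right after the first 3D convolution.
lite = pool3d_shape(clip)

# Hypothetical dense block: 16 input channels, 4 layers, growth rate 12.
inputs, out = dense_block_channels(in_channels=16, num_layers=4, growth_rate=12)

print("per-layer input channels:", inputs)
print("block output channels:", out)
print("feature-map size without early pooling:", clip)
print("feature-map size with early pooling:", lite)
```

Under these assumed settings, the early pooling step reduces the volume of each feature map by a factor of eight, which suggests why such a layer might make a deeper stack more affordable; the actual configuration used in the thesis may differ.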