Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks
This paper proposes drop transformation networks (DTNs), a novel framework for learning transformation-invariant representations of images with good flexibility and generalization ability. Convolutional neural networks are a powerful end-to-end learning framework that can learn hierarchies of representations...
Main Authors: | Chunxiao Fan, Yang Li, Guijin Wang, Yong Li |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2018-01-01 |
Series: | IEEE Access |
Subjects: | Convolutional neural networks; deep learning; image representation; transformation-invariance |
Online Access: | https://ieeexplore.ieee.org/document/8397162/ |
id |
doaj-14e803b61cfa4ab6b4d9233356d23e16 |
---|---|
record_format |
Article |
spelling |
doaj-14e803b61cfa4ab6b4d9233356d23e16 | 2021-03-29T21:24:48Z | eng | IEEE | IEEE Access | 2169-3536 | 2018-01-01 | vol. 6, pp. 73357-73369 | doi:10.1109/ACCESS.2018.2850965 | article 8397162 |
Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks |
Chunxiao Fan (School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, China); Yang Li (https://orcid.org/0000-0001-5848-3461, School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, China); Guijin Wang (https://orcid.org/0000-0002-2131-3044, Department of Electronic Engineering, Tsinghua University, Beijing, China); Yong Li (School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, China) |
https://ieeexplore.ieee.org/document/8397162/ |
Convolutional neural networks; deep learning; image representation; transformation-invariance |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chunxiao Fan; Yang Li; Guijin Wang; Yong Li |
spellingShingle |
Chunxiao Fan; Yang Li; Guijin Wang; Yong Li; Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks; IEEE Access; Convolutional neural networks; deep learning; image representation; transformation-invariance |
author_facet |
Chunxiao Fan; Yang Li; Guijin Wang; Yong Li |
author_sort |
Chunxiao Fan |
title |
Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks |
title_short |
Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks |
title_full |
Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks |
title_fullStr |
Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks |
title_full_unstemmed |
Learning Transformation-Invariant Representations for Image Recognition With Drop Transformation Networks |
title_sort |
learning transformation-invariant representations for image recognition with drop transformation networks |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2018-01-01 |
description |
This paper proposes drop transformation networks (DTNs), a novel framework for learning transformation-invariant representations of images with good flexibility and generalization ability. Convolutional neural networks are a powerful end-to-end learning framework that can learn hierarchies of representations. Although translation invariance of the representations can be introduced by stacking convolutional and max-pooling layers, this approach is not effective against other geometric transformations such as rotation and scaling. Rotation and scale invariance are usually obtained through data augmentation, but this requires a larger model and more training time. A DTN builds transformation-invariant representations by explicitly manipulating geometric transformations inside the network: it applies multiple random transformations to its inputs but keeps only one output, selected according to a given dropout policy. In this way, the learned knowledge depends less on the particular transformations present in the training data, which improves generalization to transformations. Another advantage of DTN is its flexibility: under the proposed framework, data augmentation can be seen as a special case. We evaluate DTN on three benchmark data sets and show that it provides better performance with fewer parameters than state-of-the-art methods. |
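To make the mechanism in the description more concrete, the following PyTorch-style sketch shows one way a "drop transformation" layer could look, assuming random rotation plus isotropic scaling as the candidate transformations and a uniform keep-one policy. The class name `DropTransformation`, its parameters, and the identity behavior at test time are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch of the "apply several random transformations, keep only one"
# idea from the abstract. Hypothetical implementation: the transformation set
# (rotation + isotropic scale) and the uniform keep-one policy are assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class DropTransformation(nn.Module):
    """During training, samples several candidate affine transforms per batch
    and applies exactly one of them (the rest are "dropped"). At evaluation
    time the input passes through unchanged (one possible test-time policy)."""

    def __init__(self, num_candidates=4, max_angle=math.pi / 6, scale_range=(0.8, 1.2)):
        super().__init__()
        self.num_candidates = num_candidates
        self.max_angle = max_angle
        self.scale_range = scale_range

    def _random_theta(self, batch_size, device):
        # Sample a rotation angle and an isotropic scale per image, and build
        # the corresponding 2x3 affine matrices for affine_grid.
        angle = (torch.rand(batch_size, device=device) * 2 - 1) * self.max_angle
        lo, hi = self.scale_range
        scale = torch.rand(batch_size, device=device) * (hi - lo) + lo
        cos, sin = torch.cos(angle) * scale, torch.sin(angle) * scale
        theta = torch.zeros(batch_size, 2, 3, device=device)
        theta[:, 0, 0], theta[:, 0, 1] = cos, -sin
        theta[:, 1, 0], theta[:, 1, 1] = sin, cos
        return theta

    def forward(self, x):
        if not self.training:
            return x  # identity at test time (assumed policy)
        # Sample several candidate transforms, then keep exactly one.
        candidates = [self._random_theta(x.size(0), x.device)
                      for _ in range(self.num_candidates)]
        keep = torch.randint(len(candidates), (1,)).item()
        grid = F.affine_grid(candidates[keep], x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```

Placed before or between convolutional blocks, such a layer exposes the network to transformed inputs while dropping all but one candidate per forward pass; with a single candidate it degenerates to ordinary random augmentation, consistent with the description's remark that data augmentation can be seen as a special case of the framework.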
topic |
Convolutional neural networks; deep learning; image representation; transformation-invariance |
url |
https://ieeexplore.ieee.org/document/8397162/ |
work_keys_str_mv |
AT chunxiaofan learningtransformationinvariantrepresentationsforimagerecognitionwithdroptransformationnetworks AT yangli learningtransformationinvariantrepresentationsforimagerecognitionwithdroptransformationnetworks AT guijinwang learningtransformationinvariantrepresentationsforimagerecognitionwithdroptransformationnetworks AT yongli learningtransformationinvariantrepresentationsforimagerecognitionwithdroptransformationnetworks |
_version_ |
1724192962174255104 |