Multi-Level Wavelet Convolutional Neural Networks
In computer vision, convolutional networks (CNNs) often adopt pooling to enlarge the receptive field, which has the advantage of low computational complexity. However, pooling can cause information loss and is thus detrimental to subsequent operations such as feature extraction and analysis. Recently, dila...
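Main Authors: | Pengju Liu, Hongzhi Zhang, Wei Lian, Wangmeng Zuo |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Convolutional networks; receptive field size; efficiency; multi-level wavelet |
Online Access: | https://ieeexplore.ieee.org/document/8732332/ |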
id |
doaj-3ca370210ca749adb5d059eeef768616 |
---|---|
record_format |
Article |
spelling |
doaj-3ca370210ca749adb5d059eeef768616; 2021-03-29T23:42:56Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 74973; 74985; 10.1109/ACCESS.2019.2921451; 8732332; Multi-Level Wavelet Convolutional Neural Networks; Pengju Liu (0, https://orcid.org/0000-0001-8413-9621); Hongzhi Zhang (1); Wei Lian (2); Wangmeng Zuo (3); (0) School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; (1) School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; (2) Department of Computer Science, Changzhi University, Changzhi, China; (3) School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; In computer vision, convolutional networks (CNNs) often adopt pooling to enlarge the receptive field, which has the advantage of low computational complexity. However, pooling can cause information loss and is thus detrimental to subsequent operations such as feature extraction and analysis. Recently, the dilated filter has been proposed to trade off between receptive field size and efficiency, but the accompanying gridding effect can cause a sparse sampling of input images with checkerboard patterns. To address this problem, in this paper, we propose a novel multi-level wavelet CNN (MWCNN) model to achieve a better tradeoff between receptive field size and computational efficiency. The core idea is to embed the wavelet transform into the CNN architecture to reduce the resolution of feature maps while, at the same time, increasing the receptive field. Specifically, MWCNN for image restoration is based on the U-Net architecture, and the inverse wavelet transform (IWT) is deployed to reconstruct the high-resolution (HR) feature maps. The proposed MWCNN can also be viewed as an improvement of the dilated filter and a generalization of average pooling, and can be applied not only to image restoration tasks but also to any CNN requiring a pooling operation. The experimental results demonstrate the effectiveness of the proposed MWCNN for tasks such as image denoising, single-image super-resolution, JPEG image artifact removal, and object classification.; https://ieeexplore.ieee.org/document/8732332/; Convolutional networks; receptive field size; efficiency; multi-level wavelet |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Pengju Liu, Hongzhi Zhang, Wei Lian, Wangmeng Zuo |
author_sort |
Pengju Liu |
title |
Multi-Level Wavelet Convolutional Neural Networks |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
In computer vision, convolutional networks (CNNs) often adopt pooling to enlarge the receptive field, which has the advantage of low computational complexity. However, pooling can cause information loss and is thus detrimental to subsequent operations such as feature extraction and analysis. Recently, the dilated filter has been proposed to trade off between receptive field size and efficiency, but the accompanying gridding effect can cause a sparse sampling of input images with checkerboard patterns. To address this problem, in this paper, we propose a novel multi-level wavelet CNN (MWCNN) model to achieve a better tradeoff between receptive field size and computational efficiency. The core idea is to embed the wavelet transform into the CNN architecture to reduce the resolution of feature maps while, at the same time, increasing the receptive field. Specifically, MWCNN for image restoration is based on the U-Net architecture, and the inverse wavelet transform (IWT) is deployed to reconstruct the high-resolution (HR) feature maps. The proposed MWCNN can also be viewed as an improvement of the dilated filter and a generalization of average pooling, and can be applied not only to image restoration tasks but also to any CNN requiring a pooling operation. The experimental results demonstrate the effectiveness of the proposed MWCNN for tasks such as image denoising, single-image super-resolution, JPEG image artifact removal, and object classification. |
topic |
Convolutional networks; receptive field size; efficiency; multi-level wavelet |
url |
https://ieeexplore.ieee.org/document/8732332/ |
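The description field states that a wavelet transform can replace pooling as a resolution-reducing step, that the inverse transform (IWT) restores the high-resolution feature maps, and that the approach generalizes average pooling. The following is a minimal, dependency-free sketch of that idea, not the authors' implementation. It assumes a one-level orthonormal 2D Haar wavelet (the record does not name the wavelet); the function names `haar_dwt2`, `haar_iwt2`, and `avg_pool2` and the sub-band labels LL/LH/HL/HH are illustrative choices.

```python
def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (assumed wavelet).

    x is a list of rows with even height and width. Returns four
    half-resolution sub-bands (ll, lh, hl, hh).
    """
    h, w = len(x), len(x[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            # The four samples of one non-overlapping 2x2 block.
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2  # low-pass in both directions
            lh[r][s] = (a - b + c - d) / 2  # difference between columns
            hl[r][s] = (a + b - c - d) / 2  # difference between rows
            hh[r][s] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_iwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = len(ll), len(ll[0])
    x = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for s in range(w):
            p, q, u, v = ll[r][s], lh[r][s], hl[r][s], hh[r][s]
            x[2 * r][2 * s] = (p + q + u + v) / 2
            x[2 * r][2 * s + 1] = (p - q + u - v) / 2
            x[2 * r + 1][2 * s] = (p + q - u - v) / 2
            x[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return x


def avg_pool2(x):
    """Plain 2x2 average pooling with stride 2, for comparison."""
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]


# Small integer-valued example so the arithmetic is exact in floating point.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_iwt2(ll, lh, hl, hh)
```

Under these assumptions, the LL sub-band is exactly twice the 2x2 average-pooled output, while the three detail sub-bands keep the information that average pooling would discard. That is why `haar_iwt2` can rebuild `image` without loss, whereas average pooling alone cannot be inverted.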