Multi-Level Wavelet Convolutional Neural Networks

In computer vision, convolutional networks (CNNs) often adopt pooling to enlarge the receptive field, which has the advantage of low computational complexity. However, pooling can cause information loss and is thus detrimental to subsequent operations such as feature extraction and analysis. Recently, dila...


Bibliographic Details
Main Authors: Pengju Liu, Hongzhi Zhang, Wei Lian, Wangmeng Zuo
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Convolutional networks; receptive field size; efficiency; multi-level wavelet
Online Access: https://ieeexplore.ieee.org/document/8732332/
id doaj-3ca370210ca749adb5d059eeef768616
record_format Article
spelling doaj-3ca370210ca749adb5d059eeef768616
2021-03-29T23:42:56Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
7
74973
74985
10.1109/ACCESS.2019.2921451
8732332
Multi-Level Wavelet Convolutional Neural Networks
Pengju Liu (https://orcid.org/0000-0001-8413-9621), School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Hongzhi Zhang, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Wei Lian, Department of Computer Science, Changzhi University, Changzhi, China
Wangmeng Zuo, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
https://ieeexplore.ieee.org/document/8732332/
Convolutional networks
receptive field size
efficiency
multi-level wavelet
collection DOAJ
language English
format Article
sources DOAJ
author Pengju Liu
Hongzhi Zhang
Wei Lian
Wangmeng Zuo
title Multi-Level Wavelet Convolutional Neural Networks
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description In computer vision, convolutional networks (CNNs) often adopt pooling to enlarge the receptive field, which has the advantage of low computational complexity. However, pooling can cause information loss and is thus detrimental to subsequent operations such as feature extraction and analysis. Recently, dilated filtering has been proposed to trade off receptive field size against efficiency, but the accompanying gridding effect can cause sparse sampling of input images with checkerboard patterns. To address this problem, we propose a novel multi-level wavelet CNN (MWCNN) model to achieve a better tradeoff between receptive field size and computational efficiency. The core idea is to embed the wavelet transform into the CNN architecture to reduce the resolution of feature maps while at the same time increasing the receptive field. Specifically, MWCNN for image restoration is based on the U-Net architecture, and the inverse wavelet transform (IWT) is deployed to reconstruct the high-resolution (HR) feature maps. The proposed MWCNN can also be viewed as an improvement on dilated filtering and a generalization of average pooling, and it can be applied not only to image restoration tasks but also to any CNN requiring a pooling operation. The experimental results demonstrate the effectiveness of the proposed MWCNN for tasks such as image denoising, single-image super-resolution, JPEG image artifact removal, and object classification.
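The downsampling idea in the description can be illustrated with one level of a 2-D Haar wavelet transform. What follows is a minimal NumPy sketch, not the authors' implementation: the choice of the Haar wavelet, the function names haar_dwt2 and haar_iwt2, and the sub-band labels are assumptions made for illustration. It shows two properties the description relies on: the transform halves the spatial resolution without discarding information (the inverse restores the input exactly), and its low-frequency band equals 2x2 average pooling up to a constant factor.

```python
import numpy as np


def haar_dwt2(x):
    """One level of a 2-D Haar transform on an (H, W) array with even H and W.

    Returns four (H/2, W/2) sub-bands. The labels ll/lh/hl/hh follow one
    common convention; other texts swap lh and hl.
    """
    x = np.asarray(x, dtype=float)
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency band (block sum / 2)
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_iwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the (2H, 2W) array from four sub-bands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def avg_pool2(x):
    """Plain 2x2 average pooling, for comparison with the ll band."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


if __name__ == "__main__":
    feature_map = np.arange(16.0).reshape(4, 4)
    bands = haar_dwt2(feature_map)
    # Resolution is halved, but the four bands together keep all information.
    assert all(band.shape == (2, 2) for band in bands)
    assert np.allclose(haar_iwt2(*bands), feature_map)
    # Keeping only ll (scaled by 1/2) reproduces average pooling.
    assert np.allclose(bands[0] / 2.0, avg_pool2(feature_map))
```

In this sketch, discarding the three detail bands and keeping only ll reduces the transform to average pooling, which is one way to read the description's remark that the approach generalizes average pooling; a network that keeps all four bands as channels loses nothing at the downsampling step.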
topic Convolutional networks
receptive field size
efficiency
multi-level wavelet
url https://ieeexplore.ieee.org/document/8732332/