Summary: | Deep convolutional neural networks (CNNs) have shown excellent performance in computer vision applications, including image restoration and synthesis, mainly because deep CNN features usually have more powerful representation ability than classic features. However, achieving better performance still requires enhancing deep features with advanced CNN models that take specific domain knowledge into account. In this dissertation, we focus on designing efficient deep CNN models for image restoration and synthesis. We first present a residual dense network (RDN) for image restoration that learns hierarchical features. From a low-quality input, our RDN extracts residual information, which is essential for recovering a high-quality result. We then enhance the more informative deep CNN features with various attention mechanisms. Specifically, we propose a residual in residual (RIR) structure to obtain very deep CNN features, which are adaptively rescaled by our channel attention. We further design residual local and non-local attention blocks to extract features that capture long-range dependencies between pixels. We also investigate deep CNN features in image synthesis tasks such as style transfer and texture hallucination. We propose a flexible and general multimodal style transfer method. Based on visualizations of the deep style features, we introduce a multimodal style representation, which is obtained through clustering. We then propose multimodal style matching, in which we match the clustered sub-style components with local content features under a graph cut formulation. In addition, we study texture hallucination as an image synthesis problem and propose an efficient high-resolution hallucination network for very large scaling factors. In summary, this dissertation studies efficient deep CNN models for high-quality image restoration and synthesis. 
Our proposed deep CNN models have shown promising performance in a wide range of computer vision applications, such as image super-resolution, denoising, deblurring, demosaicing, compression artifact reduction, neural style transfer, and texture hallucination.--Author's abstract
|