Self-supervised intrinsic image decomposition

Intrinsic decomposition from a single image is a highly challenging task, due to its inherent ambiguity and the scarcity of training data. In contrast to traditional fully supervised learning approaches, in this paper we propose learning intrinsic image decomposition by explaining the input image.


Bibliographic Details
Main Authors: Janner, Michael (Author), Wu, Jiajun (Author), Kulkarni, Tejas Dattatraya (Author), Yildirim, Ilker (Author), Tenenbaum, Joshua B (Author)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences (Contributor)
Format: Article
Language: English
Published: Neural Information Processing Systems Foundation, Inc., 2020-08-18T20:51:53Z.
Online Access: https://hdl.handle.net/1721.1/126660
LEADER 01914 am a22002173u 4500
001 126660
042 |a dc 
100 1 0 |a Janner, Michael  |e author 
100 1 0 |a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences  |e contributor 
700 1 0 |a Wu, Jiajun  |e author 
700 1 0 |a Kulkarni, Tejas Dattatraya  |e author 
700 1 0 |a Yildirim, Ilker  |e author 
700 1 0 |a Tenenbaum, Joshua B  |e author 
245 0 0 |a Self-supervised intrinsic image decomposition 
260 |b Neural Information Processing Systems Foundation, Inc.,   |c 2020-08-18T20:51:53Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/126660 
520 |a Intrinsic decomposition from a single image is a highly challenging task, due to its inherent ambiguity and the scarcity of training data. In contrast to traditional fully supervised learning approaches, in this paper we propose learning intrinsic image decomposition by explaining the input image. Our model, the Rendered Intrinsics Network (RIN), joins together an image decomposition pipeline, which predicts reflectance, shape, and lighting conditions given a single image, with a recombination function, a learned shading model used to recompose the original input based off of intrinsic image predictions. Our network can then use unsupervised reconstruction error as an additional signal to improve its intermediate representations. This allows large-scale unlabeled data to be useful during training, and also enables transferring learned knowledge to images of unseen object categories, lighting conditions, and shapes. Extensive experiments demonstrate that our method performs well on both intrinsic image decomposition and knowledge transfer. 
546 |a en 
655 7 |a Article 
773 |t Advances in Neural Information Processing Systems
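
The abstract's central idea is that predicted intrinsics (reflectance, shape, lighting) can be recombined into an image and compared against the input, with the reconstruction error serving as a training signal. The following is a minimal sketch of that loop under strong assumptions: a fixed Lambertian shader stands in for the learned shading model the abstract describes, the "image" is three grayscale pixels, and every function and variable name is hypothetical rather than taken from the authors' code.

```python
# Illustrative sketch only. A fixed Lambertian shader replaces the learned
# shading model described in the abstract; all names here are hypothetical.

def lambertian_shading(normals, light):
    """Per-pixel shading max(0, n . l) for unit normals and a light direction."""
    return [max(0.0, sum(n_i * l_i for n_i, l_i in zip(n, light))) for n in normals]

def recombine(reflectance, shading):
    """Recompose an image as the per-pixel product reflectance * shading."""
    return [r * s for r, s in zip(reflectance, shading)]

def reconstruction_error(image, reconstruction):
    """Mean squared error between the input image and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)

# Toy three-pixel grayscale input and the intrinsics a decomposer might predict.
image = [0.8, 0.0, 0.0]
reflectance = [0.8, 0.5, 0.2]
normals = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
light = (0.0, 0.0, 1.0)

shading = lambertian_shading(normals, light)
reconstruction = recombine(reflectance, shading)
error = reconstruction_error(image, reconstruction)
```

In a trainable version, this error would be backpropagated into the networks that predict the intrinsics; no labels are needed to compute it, which is what would let unlabeled images contribute to training.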