Reinventing 2D Convolutions for 3D Images

There have been considerable debates over 2D and 3D representation learning on 3D medical images. 2D approaches can benefit from large-scale 2D pretraining, but they are generally weak in capturing large 3D contexts. 3D approaches are natively strong in 3D contexts; however, few publicly available 3D medical datasets are large and diverse enough for universal 3D pretraining. Even for hybrid (2D + 3D) approaches, the intrinsic disadvantages of the 2D/3D parts still exist. In this study, we bridge the gap between 2D and 3D convolutions by reinventing the 2D convolutions. We propose ACS (axial-coronal-sagittal) convolutions to perform natively 3D representation learning while utilizing weights pretrained on 2D datasets. In ACS convolutions, 2D convolution kernels are split by channel into three parts and convolved separately on the three views (axial, coronal, and sagittal) of the 3D representations. Theoretically, any 2D CNN (ResNet, DenseNet, or DeepLab) can be converted into a 3D ACS CNN with pretrained weights of the same parameter size. Extensive experiments validate the consistent superiority of the pretrained ACS CNNs over 2D/3D CNN counterparts with and without pretraining. Even without pretraining, the ACS convolution can be used as a plug-and-play replacement for standard 3D convolution, with a smaller model size and less computation.
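
As described in the abstract, an ACS convolution reuses a 2D kernel bank of shape (out_channels, in_channels, k, k) by splitting it along the output-channel axis into three groups and applying each group as a 3D convolution that is flat (size 1) along one of the depth/height/width axes, i.e. on the axial, coronal, or sagittal view of the volume. The following PyTorch sketch illustrates that idea under stated assumptions: the module name ACSConvSketch, the random weight initialization, and the roughly equal three-way channel split are illustrative choices, not the authors' reference implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ACSConvSketch(nn.Module):
        """Minimal sketch of an axial-coronal-sagittal (ACS) convolution.

        A 2D-shaped kernel bank (out_ch, in_ch, k, k) is split by output
        channel into three groups; each group is applied as a 3D convolution
        that is flat (size 1) along one of the D/H/W axes, i.e. it acts on
        the axial, coronal, or sagittal view of the volume. The three outputs
        are concatenated along the channel axis.
        """

        def __init__(self, in_ch: int, out_ch: int, k: int = 3):
            super().__init__()
            # 2D-shaped weight; in practice it would be copied from a
            # pretrained 2D CNN layer, so the parameter count is unchanged.
            self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)
            self.k = k
            # Roughly equal split sizes for the three views (assumption).
            base = out_ch // 3
            self.splits = (base, base, out_ch - 2 * base)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (N, C, D, H, W) volume.
            w_a, w_c, w_s = torch.split(self.weight, self.splits, dim=0)
            p = self.k // 2
            # Axial view: kernel lives in the (H, W) plane, flat along D.
            out_a = F.conv3d(x, w_a.unsqueeze(2), padding=(0, p, p))
            # Coronal view: kernel lives in the (D, W) plane, flat along H.
            out_c = F.conv3d(x, w_c.unsqueeze(3), padding=(p, 0, p))
            # Sagittal view: kernel lives in the (D, H) plane, flat along W.
            out_s = F.conv3d(x, w_s.unsqueeze(4), padding=(p, p, 0))
            return torch.cat([out_a, out_c, out_s], dim=1)

    if __name__ == "__main__":
        conv = ACSConvSketch(in_ch=1, out_ch=6)
        vol = torch.randn(2, 1, 16, 32, 32)   # (N, C, D, H, W)
        print(conv(vol).shape)                # torch.Size([2, 6, 16, 32, 32])

Because each group's 3D kernel keeps the original k x k footprint, the layer has exactly the parameter count of the 2D layer it came from, which is what allows 2D pretrained weights to be loaded directly.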


Bibliographic Details
Main Authors: He, Y. (Author), Huang, X. (Author), Ni, B. (Author), Xu, G. (Author), Xu, J. (Author), Yang, C. (Author), Yang, J. (Author)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2021
Subjects:
Online Access: View Fulltext in Publisher
LEADER 02852nam a2200505Ia 4500
001 10.1109-JBHI.2021.3049452
008 220427s2021 CNT 000 0 und d
020 |a 2168-2194 (ISSN) 
245 1 0 |a Reinventing 2D Convolutions for 3D Images 
260 0 |b Institute of Electrical and Electronics Engineers Inc.  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1109/JBHI.2021.3049452 
520 3 |a There have been considerable debates over 2D and 3D representation learning on 3D medical images. 2D approaches can benefit from large-scale 2D pretraining, but they are generally weak in capturing large 3D contexts. 3D approaches are natively strong in 3D contexts; however, few publicly available 3D medical datasets are large and diverse enough for universal 3D pretraining. Even for hybrid (2D + 3D) approaches, the intrinsic disadvantages of the 2D/3D parts still exist. In this study, we bridge the gap between 2D and 3D convolutions by reinventing the 2D convolutions. We propose ACS (axial-coronal-sagittal) convolutions to perform natively 3D representation learning while utilizing weights pretrained on 2D datasets. In ACS convolutions, 2D convolution kernels are split by channel into three parts and convolved separately on the three views (axial, coronal, and sagittal) of the 3D representations. Theoretically, any 2D CNN (ResNet, DenseNet, or DeepLab) can be converted into a 3D ACS CNN with pretrained weights of the same parameter size. Extensive experiments validate the consistent superiority of the pretrained ACS CNNs over 2D/3D CNN counterparts with and without pretraining. Even without pretraining, the ACS convolution can be used as a plug-and-play replacement for standard 3D convolution, with a smaller model size and less computation. © 2013 IEEE. 
650 0 4 |a 2-D convolution 
650 0 4 |a 2D-to-3D transfer learning 
650 0 4 |a 3D medical image 
650 0 4 |a 3D medical images 
650 0 4 |a 3D representations 
650 0 4 |a ACS convolutions 
650 0 4 |a algorithm 
650 0 4 |a Algorithms 
650 0 4 |a article 
650 0 4 |a Convolution 
650 0 4 |a deep learning 
650 0 4 |a Detection tasks 
650 0 4 |a human 
650 0 4 |a Humans 
650 0 4 |a Imaging, Three-Dimensional 
650 0 4 |a Large dataset 
650 0 4 |a Medical dataset 
650 0 4 |a Medical imaging 
650 0 4 |a Plug and play 
650 0 4 |a Pre-training 
650 0 4 |a Three views 
650 0 4 |a three-dimensional imaging 
650 0 4 |a transfer of learning 
650 0 4 |a videorecording 
700 1 |a He, Y.  |e author 
700 1 |a Huang, X.  |e author 
700 1 |a Ni, B.  |e author 
700 1 |a Xu, G.  |e author 
700 1 |a Xu, J.  |e author 
700 1 |a Yang, C.  |e author 
700 1 |a Yang, J.  |e author 
773 |t IEEE Journal of Biomedical and Health Informatics