Summary: | This paper aims at tackling the task of fusion feature from images and their corresponding point clouds for 3D object detection in autonomous driving scenarios based on AVOD, an Aggregate View Object Detection network. The proposed fusion algorithms fuse features targeted from Bird’s Eye View (BEV) LIDAR point clouds and their corresponding RGB images. Differing in existing fusion methods, which are simply the adoption of the concatenation module, the element-wise sum module or the element-wise mean module, our proposed fusion algorithms enhance the interaction between BEV feature maps and their corresponding image feature maps by designing a novel structure, where single level feature maps and utilize multilevel feature maps. Experiments show that our proposed fusion algorithm produces better results on 3D mAP and AHS with less speed loss compared to the existing fusion method used on the KITTI 3D object detection benchmark.
|