Scale Adaptive Feature Pyramid Networks for 2D Object Detection

Object detection is one of the core tasks in computer vision. Object detection algorithms often have difficulty detecting objects with diverse scales, especially those with smaller scales. To cope with this issue, Lin et al. proposed feature pyramid networks (FPNs), which aim for a feature pyramid with higher semantic content at every scale level. The FPN consists of a bottom-up pyramid and a top-down pyramid. The bottom-up pyramid is induced by a convolutional neural network as its layers of feature maps. The top-down pyramid is formed by progressive up-sampling of a highly semantic yet low-resolution feature map at the top of the bottom-up pyramid. At each up-sampling step, feature maps of the bottom-up pyramid are fused with the top-down pyramid to produce highly semantic yet high-resolution feature maps in the top-down pyramid. Despite significant improvement, the FPN still misses small-scale objects. To further improve the detection of small-scale objects, this paper proposes scale adaptive feature pyramid networks (SAFPNs). The SAFPN employs weights chosen adaptively for each input image in fusing feature maps of the bottom-up pyramid and top-down pyramid. Scale adaptive weights are computed by a scale attention module built into the feature map fusion computation. The scale attention module is trained end-to-end to adapt to the scale of objects contained in images of the training dataset. Experimental evaluation, using both the two-stage detector Faster R-CNN and the one-stage detector RetinaNet, demonstrated the proposed approach's effectiveness.
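The per-level fusion step the abstract describes can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the authors' implementation: the function names are invented, and the scale attention module is stood in for by global average pooling followed by a learned linear map and a sigmoid, producing one fusion weight per image.

```python
# Toy sketch of SAFPN-style scale-adaptive fusion at one pyramid level.
# All names (upsample2x_nearest, scale_attention_weight, fuse) and the
# attention form are illustrative assumptions, not the paper's code.
import numpy as np

def upsample2x_nearest(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def scale_attention_weight(lateral, w, b):
    """Stand-in for the scale attention module: global average pooling
    per channel, then a learned linear map + sigmoid, yielding a
    fusion weight in (0, 1) that adapts to the input image."""
    pooled = lateral.mean(axis=(1, 2))      # (C,) per-channel averages
    return sigmoid(pooled @ w + b)          # scalar weight

def fuse(lateral, top_down, w, b):
    """Fuse a bottom-up (lateral) map with the upsampled top-down map,
    weighting the lateral contribution by the scale attention output."""
    alpha = scale_attention_weight(lateral, w, b)
    return alpha * lateral + upsample2x_nearest(top_down)

rng = np.random.default_rng(0)
lateral = rng.standard_normal((8, 16, 16))   # bottom-up map, C=8
top_down = rng.standard_normal((8, 8, 8))    # coarser top-down map
w, b = rng.standard_normal(8), 0.0           # toy attention parameters
fused = fuse(lateral, top_down, w, b)
print(fused.shape)  # (8, 16, 16)
```

In a real network the attention parameters would be learned end-to-end with the detector, so the fusion weight adapts to the object scales present in each training image; a plain FPN corresponds to fixing the weight at 1.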


Bibliographic Details
Main Authors: Lifei He, Ming Jiang, Ryutarou Ohbuchi, Takahiko Furuya, Min Zhang, Pengfei Li
Format: Article
Language: English
Published: Hindawi Limited, 2020-01-01
Series: Scientific Programming
ISSN: 1875-919X
Online Access: http://dx.doi.org/10.1155/2020/8839979