GISCA: Gradient-Inductive Segmentation Network With Contextual Attention for Scene Text Detection

Scene text detection (STD) is an irreplaceable step in a scene text reading system. It remains a more challenging task than general object detection since text objects are of arbitrary orientations and varying sizes. Generally, segmentation methods that use U-Net or hourglass-like networks are the m...


Bibliographic Details
Main Authors:	Meng Cao, Yuexian Zou, Dongming Yang, Chao Liu
Format:	Article
Language:	English
Published:	IEEE 2019-01-01
Series:	IEEE Access
Subjects:	Scene text detection; multi-oriented text; segmentation network; contextual attention; gradient vanishing/exploding problems
Online Access:	https://ieeexplore.ieee.org/document/8709682/

id	doaj-b34fca0f6a6e4983aa41dd9125208aff
record_format	Article
spelling	doaj-b34fca0f6a6e4983aa41dd9125208aff; 2021-03-29T22:58:49Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 62805; 62816; 10.1109/ACCESS.2019.2915513; 8709682; GISCA: Gradient-Inductive Segmentation Network With Contextual Attention for Scene Text Detection; Meng Cao (https://orcid.org/0000-0002-8946-4228); Yuexian Zou; Dongming Yang; Chao Liu; ADSPLAB, School of Electronic and Computer Engineering, Peking University, Shenzhen, China
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Meng Cao Yuexian Zou Dongming Yang Chao Liu
title	GISCA: Gradient-Inductive Segmentation Network With Contextual Attention for Scene Text Detection
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2019-01-01
description	Scene text detection (STD) is an irreplaceable step in a scene text reading system. It remains more challenging than general object detection because text objects have arbitrary orientations and varying sizes. Segmentation methods that use U-Net or hourglass-like networks are the mainstream approaches in multi-oriented text detection tasks. However, experience has shown that text-like objects in complex backgrounds produce high response values on the output feature map of U-Net, which leads to a severe false positive rate and degrades STD performance. To tackle this issue, an adaptive soft attention mechanism called the contextual attention module (CAM) is devised and integrated into U-Net to highlight salient areas while retaining more detail information. In addition, the gradient vanishing and exploding problems make U-Net more difficult to train because of the nonlinear deconvolution layer used in the up-sampling process. To facilitate training, a gradient-inductive module (GIM) is designed to provide a linear bypass that makes gradient back-propagation more stable. Accordingly, an end-to-end trainable Gradient-Inductive Segmentation network with Contextual Attention (GISCA) is proposed. Experimental results on three public benchmarks demonstrate that the proposed GISCA achieves state-of-the-art results in terms of F-measure: 92.1%, 87.3%, and 81.4% for ICDAR 2013, ICDAR 2015, and MSRA-TD500, respectively.
topic	Scene text detection; multi-oriented text; segmentation network; contextual attention; gradient vanishing/exploding problems
url	https://ieeexplore.ieee.org/document/8709682/


Similar Items