Scene text detection via extremal region based double threshold convolutional network classification.

In this paper, we present a robust approach to text detection in natural images, based on a region proposal mechanism. A powerful low-level detector, saliency-enhanced MSER, extends the widely used MSER by incorporating saliency detection methods, which ensures a high recall...


Bibliographic Details
Main Authors:	Wei Zhu, Jing Lou, Longtao Chen, Qingyuan Xia, Mingwu Ren
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2017-01-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC5562312?pdf=render

id	doaj-8d0b06b7e6c5452d98da3f54d96854f0
record_format	Article
volume	12
issue	8
article_number	e0182227
doi	10.1371/journal.pone.0182227
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Wei Zhu, Jing Lou, Longtao Chen, Qingyuan Xia, Mingwu Ren
title	Scene text detection via extremal region based double threshold convolutional network classification.
publisher	Public Library of Science (PLoS)
series	PLoS ONE
issn	1932-6203
publishDate	2017-01-01
description	In this paper, we present a robust approach to text detection in natural images, based on a region proposal mechanism. A powerful low-level detector, saliency-enhanced MSER, extends the widely used MSER by incorporating saliency detection methods, which ensures a high recall rate. Given a natural image, character candidates are extracted from three channels of a perception-based, illumination-invariant color space by the saliency-enhanced MSER algorithm. A discriminative convolutional neural network (CNN), jointly trained with multi-level information including pixel-level and character-level information, serves as the character candidate classifier. Each image patch is classified as strong text, weak text, or non-text by double-threshold filtering rather than conventional one-step classification, using the confidence scores produced by the CNN. To further prune non-text regions, we develop a recursive neighborhood search algorithm that tracks credible text in the weak-text set. Finally, characters are grouped into text lines using heuristic features such as spatial location, size, color, and stroke width. We compare our approach with several state-of-the-art methods, and experiments show that it achieves competitive performance on the public ICDAR 2011 and ICDAR 2013 datasets.
url	http://europepmc.org/articles/PMC5562312?pdf=render
_version_	1716815467228692480
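
The double-threshold filtering and neighborhood search summarized in the description can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the `Candidate` fields, the threshold values `t_low` and `t_high`, and the adjacency rule in `is_neighbor` are hypothetical, and the search is written as an iterative worklist rather than a literal recursion. Scores stand in for the CNN confidence of each character candidate.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """Hypothetical character candidate: box centre, height, classifier score."""
    x: float
    y: float
    h: float      # box height in pixels, assumed > 0
    score: float  # classifier confidence in [0, 1]


def double_threshold(cands: List[Candidate],
                     t_low: float = 0.3,
                     t_high: float = 0.8) -> Tuple[List[int], List[int]]:
    """Split candidate indices into strong text and weak text.

    Candidates scoring below t_low are treated as non-text and dropped.
    The threshold values are illustrative assumptions.
    """
    strong = [i for i, c in enumerate(cands) if c.score >= t_high]
    weak = [i for i, c in enumerate(cands) if t_low <= c.score < t_high]
    return strong, weak


def is_neighbor(a: Candidate, b: Candidate,
                max_gap: float = 2.0, max_ratio: float = 2.0) -> bool:
    """Assumed adjacency rule: close horizontally, aligned vertically, similar size."""
    scale = max(a.h, b.h)
    close = abs(a.x - b.x) <= max_gap * scale and abs(a.y - b.y) <= 0.5 * scale
    similar = scale / min(a.h, b.h) <= max_ratio
    return close and similar


def track_text(cands: List[Candidate]) -> List[int]:
    """Keep strong candidates plus weak ones reachable from them via neighbors."""
    strong, weak = double_threshold(cands)
    accepted = set(strong)
    remaining = set(weak)
    frontier = list(strong)
    while frontier:
        i = frontier.pop()
        # Promote every still-unassigned weak candidate adjacent to candidate i.
        promoted = [j for j in remaining if is_neighbor(cands[i], cands[j])]
        for j in promoted:
            remaining.discard(j)
            accepted.add(j)
            frontier.append(j)  # a promoted candidate can promote its own neighbors
    return sorted(accepted)
```

In this sketch a weak candidate survives only if a chain of neighbors links it to a strong candidate, which is the same idea as hysteresis thresholding in edge detection. Isolated weak candidates and all candidates below the lower threshold are discarded.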


Similar Items