R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation

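Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily-oriented texts. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satisfy practical detection requirements for various real-world images such as image streams or videos. To address this lacuna, we propose a novel method called Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrarily-oriented texts in natural image scenes. First, a rotated anchor box with angle information is used as the text bounding box over various orientations. Second, features of various scales are extracted from the input image to determine the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression is used to eliminate redundancy and acquire detection results with the highest accuracy. Experiments on benchmark comparison are conducted upon four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and ICDAR2017-MLT. The results indicate that the proposed R-YOLO method significantly outperforms state-of-the-art methods in terms of detection efficiency while maintaining high accuracy; for example, the proposed R-YOLO method achieves an F-measure of 82.3% at 62.5 fps with 720 p resolution on the ICDAR2015 dataset.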

Bibliographic Details
Main Authors:	Xiqi Wang, Shunyi Zheng, Ce Zhang, Rui Li, Li Gui
Format:	Article
Language:	English
Published:	MDPI AG 2021-01-01
Series:	Sensors
Subjects:	scene text detection; arbitrarily-oriented text; rotation anchor; convolutional neural network; YOLOv4
Online Access:	https://www.mdpi.com/1424-8220/21/3/888

id	doaj-c437fcb2146845408daae2a4a72bfccc
record_format	Article
spelling	doaj-c437fcb2146845408daae2a4a72bfccc; 2021-01-29T00:06:03Z; eng; MDPI AG; Sensors; 1424-8220; 2021-01-01; 21; 888; 888; 10.3390/s21030888; R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation; Xiqi Wang (0), Shunyi Zheng (1), Ce Zhang (2), Rui Li (3), Li Gui (4); (0, 1, 3, 4) School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China; (2) Lancaster Environment Centre, Lancaster University, Lancaster LA1 4YQ, UK; https://www.mdpi.com/1424-8220/21/3/888; scene text detection, arbitrarily-oriented text, rotation anchor, convolutional neural network, YOLOv4
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Xiqi Wang Shunyi Zheng Ce Zhang Rui Li Li Gui
title	R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation
publisher	MDPI AG
series	Sensors
issn	1424-8220
publishDate	2021-01-01
description	Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily-oriented texts. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satisfy practical detection requirements for various real-world images such as image streams or videos. To address this lacuna, we propose a novel method called Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrarily-oriented texts in natural image scenes. First, a rotated anchor box with angle information is used as the text bounding box over various orientations. Second, features of various scales are extracted from the input image to determine the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression is used to eliminate redundancy and acquire detection results with the highest accuracy. Experiments on benchmark comparison are conducted upon four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and ICDAR2017-MLT. The results indicate that the proposed R-YOLO method significantly outperforms state-of-the-art methods in terms of detection efficiency while maintaining high accuracy; for example, the proposed R-YOLO method achieves an F-measure of 82.3% at 62.5 fps with 720 p resolution on the ICDAR2015 dataset.
topic	scene text detection; arbitrarily-oriented text; rotation anchor; convolutional neural network; YOLOv4
url	https://www.mdpi.com/1424-8220/21/3/888

Similar Items