Weighted combination of per-frame recognition results for text recognition in a video stream

The range of applications of automated document recognition has expanded, and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure...


Bibliographic Details
Main Authors:	O. Petrova, K. Bulatov, V.V. Arlazarov, V.L. Arlazarov
Format:	Article
Language:	English
Published:	Samara National Research University 2021-02-01
Series:	Компьютерная оптика
Subjects:	mobile OCR, video stream, anytime algorithms, weighted combination, ensemble methods
Online Access:	http://computeroptics.ru/KO/PDF/KO45-1/450110.pdf

id	doaj-b83a7be21b0e4757841898ca56219f31
record_format	Article
spelling	doaj-b83a7be21b0e4757841898ca56219f31 | 2021-02-27T14:42:47Z | eng | Samara National Research University | Компьютерная оптика | 0134-2452 | 2412-6179 | 2021-02-01 | 45 | 1 | 77 | 89 | 10.18287/2412-6179-CO-795 | Weighted combination of per-frame recognition results for text recognition in a video stream | O. Petrova (0): FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia | K. Bulatov (1): FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia; Moscow Institute of Physics and Technology (State University), Moscow, Russia | V.V. Arlazarov (2): FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia | V.L. Arlazarov (3): FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia; Moscow Institute of Physics and Technology (State University), Moscow, Russia | http://computeroptics.ru/KO/PDF/KO45-1/450110.pdf | mobile OCR, video stream, anytime algorithms, weighted combination, ensemble methods
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	O. Petrova, K. Bulatov, V.V. Arlazarov, V.L. Arlazarov
spellingShingle	O. Petrova, K. Bulatov, V.V. Arlazarov, V.L. Arlazarov | Weighted combination of per-frame recognition results for text recognition in a video stream | Компьютерная оптика | mobile OCR, video stream, anytime algorithms, weighted combination, ensemble methods
author_facet	O. Petrova, K. Bulatov, V.V. Arlazarov, V.L. Arlazarov
author_sort	O. Petrova
title	Weighted combination of per-frame recognition results for text recognition in a video stream
title_short	Weighted combination of per-frame recognition results for text recognition in a video stream
title_full	Weighted combination of per-frame recognition results for text recognition in a video stream
title_fullStr	Weighted combination of per-frame recognition results for text recognition in a video stream
title_full_unstemmed	Weighted combination of per-frame recognition results for text recognition in a video stream
title_sort	weighted combination of per-frame recognition results for text recognition in a video stream
publisher	Samara National Research University
series	Компьютерная оптика
issn	0134-2452 2412-6179
publishDate	2021-02-01
description	The range of applications of automated document recognition has expanded, and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capturing conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow a video stream to be used as input, yielding several images of the recognized object captured with varying characteristics. This raises the problem of combining the information from multiple input frames. In this paper, we propose a weighting model for the process of combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting conditions. The experimental results show that weighted combination can improve the quality of text recognition results in a video stream, and that the per-character weighting method with input image focus estimation as the base criterion achieves the best results on the datasets analyzed.
topic	mobile OCR, video stream, anytime algorithms, weighted combination, ensemble methods
url	http://computeroptics.ru/KO/PDF/KO45-1/450110.pdf
work_keys_str_mv	AT opetrova weightedcombinationofperframerecognitionresultsfortextrecognitioninavideostream AT kbulatov weightedcombinationofperframerecognitionresultsfortextrecognitioninavideostream AT vvarlazarov weightedcombinationofperframerecognitionresultsfortextrecognitioninavideostream AT vlarlazarov weightedcombinationofperframerecognitionresultsfortextrecognitioninavideostream
_version_	1724247971112943616
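
The description above mentions per-character weighted combination of per-frame recognition results. The record does not give the actual algorithm, so the following is only a minimal illustrative sketch under strong assumptions: each frame yields a string that is already aligned and of equal length, and each frame carries a single hypothetical weight (for example, a focus score). The function name combine_weighted and the sample data are invented for illustration and are not taken from the article.

```python
from collections import defaultdict


def combine_weighted(frames):
    """Combine per-frame strings by weighted per-character voting.

    frames: list of (text, weight) pairs. This sketch assumes the texts
    are pre-aligned and equal in length; the weight is a hypothetical
    per-frame quality score such as a focus estimate.
    """
    if not frames:
        return ""
    length = len(frames[0][0])
    if any(len(text) != length for text, _ in frames):
        raise ValueError("this sketch assumes equal-length, aligned strings")

    combined = []
    for pos in range(length):
        # Sum the weights of every frame voting for each candidate character.
        votes = defaultdict(float)
        for text, weight in frames:
            votes[text[pos]] += weight
        # Keep the character with the largest total weight at this position.
        combined.append(max(votes, key=votes.get))
    return "".join(combined)


# Hypothetical example: three frames with made-up focus scores.
frames = [("P4SSPORT", 0.2), ("PASSPORT", 0.9), ("PASSP0RT", 0.4)]
result = combine_weighted(frames)
```

In this sketch a sharp frame outvotes blurrier ones at each character position; with all weights equal it reduces to plain majority voting.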
