Weighted combination of per-frame recognition results for text recognition in a video stream
The scope of uses of automated document recognition has extended and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure...
Main Authors: | O. Petrova, K. Bulatov, V.V. Arlazarov, V.L. Arlazarov |
---|---|
Format: | Article |
Language: | English |
Published: | Samara National Research University 2021-02-01 |
Series: | Компьютерная оптика |
Subjects: | mobile ocr, video stream, anytime algorithms, weighted combination, ensemble methods |
Online Access: | http://computeroptics.ru/KO/PDF/KO45-1/450110.pdf |
id |
doaj-b83a7be21b0e4757841898ca56219f31 |
---|---|
record_format |
Article |
spelling |
doaj-b83a7be21b0e4757841898ca56219f31 2021-02-27T14:42:47Z eng Samara National Research University Компьютерная оптика 0134-2452 2412-6179 2021-02-01 45 1 77 89 10.18287/2412-6179-CO-795 Weighted combination of per-frame recognition results for text recognition in a video stream O. Petrova 0 K. Bulatov 1 V.V. Arlazarov 2 V.L. Arlazarov 3 FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia; Moscow Institute of Physics and Technology (State University), Moscow, Russia FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia; Moscow Institute of Physics and Technology (State University), Moscow, Russia The scope of uses of automated document recognition has extended and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capturing conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow using a video stream as an input, thus obtaining several images of the recognized object, captured with various characteristics. In this case, a problem of combining the information from multiple input frames arises. In this paper, we propose a weighting model for the process of combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting conditions. The experimental results show that weighted combination can improve the text recognition result quality in the video stream, and that the per-character weighting method with input image focus estimation as a base criterion achieves the best results on the datasets analyzed. http://computeroptics.ru/KO/PDF/KO45-1/450110.pdf mobile ocr video stream anytime algorithms weighted combination ensemble methods |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
O. Petrova K. Bulatov V.V. Arlazarov V.L. Arlazarov |
author_facet |
O. Petrova K. Bulatov V.V. Arlazarov V.L. Arlazarov |
author_sort |
O. Petrova |
title |
Weighted combination of per-frame recognition results for text recognition in a video stream |
title_sort |
weighted combination of per-frame recognition results for text recognition in a video stream |
publisher |
Samara National Research University |
series |
Компьютерная оптика |
issn |
0134-2452 2412-6179 |
publishDate |
2021-02-01 |
description |
The scope of uses of automated document recognition has extended and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capturing conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow using a video stream as an input, thus obtaining several images of the recognized object, captured with various characteristics. In this case, a problem of combining the information from multiple input frames arises. In this paper, we propose a weighting model for the process of combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting conditions. The experimental results show that weighted combination can improve the text recognition result quality in the video stream, and that the per-character weighting method with input image focus estimation as a base criterion achieves the best results on the datasets analyzed. |
topic |
mobile ocr video stream anytime algorithms weighted combination ensemble methods |
url |
http://computeroptics.ru/KO/PDF/KO45-1/450110.pdf |
_version_ |
1724247971112943616 |
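
The description in this record mentions a per-character weighted combination of per-frame recognition results, with a focus estimate as one possible weighting criterion. The Python sketch below illustrates that general idea only; it is not the authors' implementation. It assumes the per-frame results are already aligned to the same length (the record does not describe the alignment step), that each character position carries a dictionary of candidate characters with confidences, and that a per-frame weight such as a focus score is supplied by the caller. The names `combine_weighted` and `best_string` are hypothetical.

```python
from collections import defaultdict


def combine_weighted(frame_results, weights):
    """Weighted average of per-character estimates across frames.

    frame_results: list of frames; each frame is a list of dicts that map
                   a candidate character to its confidence at that position.
                   Assumption: all frames are pre-aligned to the same length.
    weights:       one non-negative weight per frame (for example, a focus
                   score computed elsewhere; hypothetical input).
    """
    if not frame_results:
        raise ValueError("at least one frame is required")
    if len(frame_results) != len(weights):
        raise ValueError("one weight per frame is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    length = len(frame_results[0])
    combined = [defaultdict(float) for _ in range(length)]
    for frame, weight in zip(frame_results, weights):
        if len(frame) != length:
            raise ValueError("frames must be aligned to the same length")
        for pos, candidates in enumerate(frame):
            for char, conf in candidates.items():
                # Each frame contributes in proportion to its weight.
                combined[pos][char] += weight * conf / total
    return combined


def best_string(combined):
    """Pick the highest-scoring candidate at every position."""
    return "".join(max(cands, key=cands.get) for cands in combined)


# Two frames of the same two-character field: the first is blurry and
# confuses "0" with "O"; the second is sharp.
frames = [
    [{"O": 0.9, "0": 0.1}, {"K": 1.0}],
    [{"0": 0.6, "O": 0.4}, {"K": 0.9, "X": 0.1}],
]

equal = best_string(combine_weighted(frames, [1.0, 1.0]))
focused = best_string(combine_weighted(frames, [0.1, 0.9]))
print(equal, focused)
```

With equal weights the confident but blurry frame dominates the first character; giving the sharp frame a larger weight lets its reading prevail. How the weights are computed, and whether they apply per frame or per character, is left open here.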