Elaboration of the hierarchical approach to segmentation of scanned documents images

The object of research is the process of recognizing regions in images of scanned documents. The paper proposes a hierarchical approach to segmenting such images. The approach represents the image of a scanned document as a multi-level structure. At each level of the structu...

Bibliographic Details
Main Authors: Alesya Ishchenko, Vladyslav Zhuchkovskyi
Format: Article
Language: English
Published: PC Technology Center 2019-06-01
Series: Tehnologìčnij Audit ta Rezervi Virobnictva
Subjects: hierarchical approach; scanned documents; image
Online Access: http://journals.uran.ua/tarp/article/view/173913
id doaj-5bf8fb68908a45aeb20bc1c2a0c4a02a
record_format Article
spelling doaj-5bf8fb68908a45aeb20bc1c2a0c4a02a; 2020-11-25T01:30:02Z; eng
PC Technology Center; Tehnologìčnij Audit ta Rezervi Virobnictva; ISSN 2226-3780, 2312-8372
2019-06-01; vol. 3, no. 2(47), pp. 39-42; DOI 10.15587/2312-8372.2019.173913; article 173913
Elaboration of the hierarchical approach to segmentation of scanned documents images
Alesya Ishchenko, Vladyslav Zhuchkovskyi (both: Odessa National Polytechnic University, 1, Shevchenko ave., Odessa, Ukraine, 65044)
http://journals.uran.ua/tarp/article/view/173913
hierarchical approach; scanned documents; image
collection DOAJ
language English
format Article
sources DOAJ
author Alesya Ishchenko
Vladyslav Zhuchkovskyi
title Elaboration of the hierarchical approach to segmentation of scanned documents images
publisher PC Technology Center
series Tehnologìčnij Audit ta Rezervi Virobnictva
issn 2226-3780
2312-8372
publishDate 2019-06-01
description The object of research is the process of recognizing regions in images of scanned documents. The paper proposes a hierarchical approach to segmenting such images. The approach represents the image of a scanned document as a multi-level structure. At each level of the structure, images containing structural regions are extracted. Objects at the lower level correspond strictly to a particular region of the upper-level image: photo and graphics regions correspond to the image containing the illustrations, and text and background regions correspond to the image containing both text and background. The hierarchical approach makes it possible to process each image region separately. First, the illustration regions are extracted from the original image of the scanned document using connected-component analysis. The first level of the hierarchy thus consists of an image containing illustrations and an image containing text with background. Then the illustration regions are divided into photos and graphics by splitting them into blocks, and text regions are separated from the background by processing the neighborhood of each pixel. The second level of the hierarchy is thus represented by images containing homogeneous regions: photos, graphics, text, and background. The hierarchical approach to segmentation reduced processing time by a factor of 80 on average. The reduction was achieved because, at each level and in each separate part of the hierarchical structure, the structural features of the homogeneous image region corresponding to that level could be taken into account. It also made it possible to choose computationally efficient features for identifying these regions, which further reduced the processing time of the scanned document.
topic hierarchical approach
scanned documents
image
url http://journals.uran.ua/tarp/article/view/173913
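
The two-level hierarchy outlined in the description can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the record does not give the features or thresholds used, so the function names (`connected_components`, `blocks_of`, `segment`, `demo_page`), the ink threshold, the bounding-box size test for illustrations, the distinct-gray-level count for photo versus graphics, and the 3x3 local-mean test for text are all hypothetical stand-ins.

```python
from collections import deque


def connected_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected True regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def blocks_of(image, box, size):
    """Yield the gray values of each size x size block inside a bounding box."""
    top, left, bottom, right = box
    for by in range(top, bottom + 1, size):
        for bx in range(left, right + 1, size):
            yield [image[y][x]
                   for y in range(by, min(by + size, bottom + 1))
                   for x in range(bx, min(bx + size, right + 1))]


def segment(image, ink=128, min_side=4, block=3, photo_levels=4, margin=30):
    """Split a grayscale page (list of rows, 0 = black) into photo, graphics, and text."""
    h, w = len(image), len(image[0])
    # Level 1 (assumed rule): dark components with a large bounding box are illustrations.
    mask = [[value < ink for value in row] for row in image]
    illustrations = [b for b in connected_components(mask)
                     if b[2] - b[0] + 1 >= min_side and b[3] - b[1] + 1 >= min_side]
    result = {"photo": [], "graphics": [], "text": set()}
    covered = [[False] * w for _ in range(h)]
    # Level 2a (assumed feature): blocks with many distinct gray levels look like a photo.
    for box in illustrations:
        blocks = list(blocks_of(image, box, block))
        busy = sum(len(set(b)) > photo_levels for b in blocks)
        result["photo" if 2 * busy >= len(blocks) else "graphics"].append(box)
        for y in range(box[0], box[2] + 1):
            for x in range(box[1], box[3] + 1):
                covered[y][x] = True
    # Level 2b (assumed feature): outside illustrations, a pixel clearly darker
    # than its 3x3 neighborhood mean is text; the rest is background.
    for y in range(h):
        for x in range(w):
            if covered[y][x]:
                continue
            hood = [image[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            if image[y][x] < sum(hood) / len(hood) - margin:
                result["text"].add((y, x))
    return result


def demo_page():
    """Build a tiny synthetic page: one photo-like box, one graphics-like box, three text dots."""
    page = [[255] * 20 for _ in range(10)]
    for y in range(1, 7):
        for x in range(1, 7):
            page[y][x] = (7 * y + 13 * x) % 120  # varied dark tones
            page[y][x + 8] = 0                   # flat ink
    for x in (2, 4, 6):
        page[8][x] = 0                           # isolated dark pixels
    return page
```

Calling `segment(demo_page())` separates the synthetic page into one photo box, one graphics box, and three text pixels, with everything else treated as background. Working on each region separately is what lets every step use a cheap feature suited to that region, which is the source of the speed-up the description reports.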