A Survey of Graphical Page Object Detection with Deep Neural Networks

In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document...

Bibliographic Details
Main Authors: Jwalin Bhatt, Khurram Azeem Hashmi, Muhammad Zeshan Afzal, Didier Stricker
Format: Article
Language: English
Published: MDPI AG 2021-06-01
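Series: Applied Sciences
Subjects: deep neural network; document images; review paper; deep learning; performance evaluation; page object detection
Online Access: https://www.mdpi.com/2076-3417/11/12/5344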
id doaj-f0f6fa57c9ab4260b6c94da4d0e1db2a
record_format Article
spelling doaj-f0f6fa57c9ab4260b6c94da4d0e1db2a
2021-06-30T23:41:23Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-06-01
11
5344
5344
10.3390/app11125344
A Survey of Graphical Page Object Detection with Deep Neural Networks
Jwalin Bhatt 0
Khurram Azeem Hashmi 1
Muhammad Zeshan Afzal 2
Didier Stricker 3
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany
Department of Computer Science, Technical University, 67663 Kaiserslautern, Germany
In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. This detection leads to a high-level conceptual understanding of documents, which makes their digitization viable. Since the advent of deep learning, object detection performance has improved manyfold. This work outlines and summarizes the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection. This work provides a comprehensive understanding of the current state of the art and related challenges. Furthermore, we discuss leading datasets along with their quantitative evaluation. Finally, we briefly discuss promising directions for further improvements.
https://www.mdpi.com/2076-3417/11/12/5344
deep neural network
document images
review paper
deep learning
performance evaluation
page object detection
collection DOAJ
language English
format Article
sources DOAJ
author Jwalin Bhatt
Khurram Azeem Hashmi
Muhammad Zeshan Afzal
Didier Stricker
spellingShingle Jwalin Bhatt
Khurram Azeem Hashmi
Muhammad Zeshan Afzal
Didier Stricker
A Survey of Graphical Page Object Detection with Deep Neural Networks
Applied Sciences
deep neural network
document images
review paper
deep learning
performance evaluation
page object detection
author_facet Jwalin Bhatt
Khurram Azeem Hashmi
Muhammad Zeshan Afzal
Didier Stricker
author_sort Jwalin Bhatt
title A Survey of Graphical Page Object Detection with Deep Neural Networks
title_short A Survey of Graphical Page Object Detection with Deep Neural Networks
title_full A Survey of Graphical Page Object Detection with Deep Neural Networks
title_fullStr A Survey of Graphical Page Object Detection with Deep Neural Networks
title_full_unstemmed A Survey of Graphical Page Object Detection with Deep Neural Networks
title_sort survey of graphical page object detection with deep neural networks
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-06-01
description In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. This detection leads to a high-level conceptual understanding of documents, which makes their digitization viable. Since the advent of deep learning, object detection performance has improved manyfold. This work outlines and summarizes the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection. This work provides a comprehensive understanding of the current state of the art and related challenges. Furthermore, we discuss leading datasets along with their quantitative evaluation. Finally, we briefly discuss promising directions for further improvements.
topic deep neural network
document images
review paper
deep learning
performance evaluation
page object detection
url https://www.mdpi.com/2076-3417/11/12/5344
work_keys_str_mv AT jwalinbhatt asurveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT khurramazeemhashmi asurveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT muhammadzeshanafzal asurveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT didierstricker asurveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT jwalinbhatt surveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT khurramazeemhashmi surveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT muhammadzeshanafzal surveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
AT didierstricker surveyofgraphicalpageobjectdetectionwithdeepneuralnetworks
_version_ 1721350724293492736