An Investigative and Goal driven Workbench for Text Extraction and Image Processing

Bibliographic Details
Main Author: Tumu, Sudheer
Language: English
Published: The Ohio State University / OhioLINK 2013
Subjects: Computer Science
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=osu1376930066
id ndltd-OhioLink-oai-etd.ohiolink.edu-osu1376930066
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-osu1376930066 2021-08-03T06:19:24Z An Investigative and Goal driven Workbench for Text Extraction and Image Processing Tumu, Sudheer Computer Science Text in images and video provides useful information for indexing, annotating and structuring images [1]. However, automatic text extraction is extremely challenging because source images vary in contrast and background complexity, and the text to be extracted varies in style, size and orientation. This calls for systematic experimentation in which experiments are recorded and results are saved. Hence an “experimental workbench” consisting of basic image processing and data analysis tools is needed to conduct an experiment, or a series of experiments, toward goals such as text extraction and basic image processing, and to save intermediate and final results. This document presents the design and implementation of such a workbench, which provides a collection of basic image processing and text extraction tools that an individual or an organization can use for tasks such as extracting text from an image or a video. The transformations provided are image-to-image transformations such as smoothing, dilation and erosion; image-to-text transformations such as optical character recognition (OCR); and text-to-text transformations such as fuzzy matching of the OCR-extracted text against an existing knowledge database to improve the accuracy of the extracted text. The workbench also supports automation and orchestration of existing tools. Users can create custom tools or transformations by combining existing tools, and can save intermediate results as checkpoints to roll back to if necessary. The workbench was used to build an online library catalog by extracting book titles from a video stream of book spines.
A custom transformation named ‘hill-climbing’, which automates a series of basic image processing and text extraction tools, was created to perform this task. The video stream was recorded by holding the camera facing the book spines and walking along the bookshelf. The main contributions of our work are thus: providing an integrated collection of image processing operations; designing and developing a workbench of basic image processing and text extraction tools for images and video; and using the proposed workbench to build an online library catalog from a video stream of book spines. 2013 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu1376930066 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
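The abstract's text-to-text step, fuzzy matching OCR output against an existing knowledge database, could look roughly like the sketch below. It is not taken from the thesis: the title list, the function name `correct_title` and the `cutoff` value are all illustrative assumptions, and Python's standard `difflib` stands in for whatever matcher the workbench actually uses.

```python
import difflib

# Hypothetical knowledge database of known book titles (illustrative only,
# not data from the thesis).
KNOWN_TITLES = [
    "Introduction to Algorithms",
    "Digital Image Processing",
    "Pattern Recognition and Machine Learning",
]


def correct_title(ocr_text, known_titles=KNOWN_TITLES, cutoff=0.6):
    """Return the known title closest to noisy OCR text, or None.

    The comparison is case-insensitive. `cutoff` is the minimum similarity
    ratio (0 to 1) a candidate needs to count as a match; 0.6 is an
    assumed value, not one reported in the source.
    """
    # Map lowercased titles back to their original spelling.
    lowered = {title.lower(): title for title in known_titles}
    matches = difflib.get_close_matches(
        ocr_text.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


if __name__ == "__main__":
    # OCR commonly confuses characters such as "l"/"1" and "I"/"l".
    print(correct_title("Digita1 lmage Process1ng"))
    print(correct_title("zzzz qqqq"))
```

Returning `None` when nothing clears the cutoff leaves room for a caller to keep the raw OCR string or rerun earlier image-processing steps with different parameters, rather than accept a poor match.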
collection NDLTD
language English
sources NDLTD
topic Computer Science
author Tumu, Sudheer
title An Investigative and Goal driven Workbench for Text Extraction and Image Processing
publisher The Ohio State University / OhioLINK
publishDate 2013
url http://rave.ohiolink.edu/etdc/view?acc_num=osu1376930066