Generating an Ordered Data Set from an OCR Text File
This tutorial illustrates strategies for taking raw OCR output from a scanned text, parsing it to isolate and correct essential elements of metadata, and generating an ordered data set (a Python dictionary) from it. These illustrations are specific to a particular text, but the overall strategy, and...
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: | Editorial Board of the Programming Historian, 2014-11-01 |
Series: | The Programming Historian |
Subjects: | |
Online Access: | http://programminghistorian.org/lessons/generating-an-ordered-data-set-from-an-OCR-text-file |