Decision tree algorithms for handwritten digit recognition

We present an original algorithm for recognizing handwritten digits. We begin by introducing a virtually infinite collection of binary geometric features. The features are queries that ask if a particular geometric arrangement of local topographic codes is present in an image. The codes, which we ca...

Bibliographic Details
Main Author: Wilder, Kenneth Joseph
Language: ENG
Published: ScholarWorks@UMass Amherst 1998
Subjects: Mathematics, Electrical engineering, Artificial intelligence, Computer science
Online Access: https://scholarworks.umass.edu/dissertations/AAI9823791
id ndltd-UMASS-oai-scholarworks.umass.edu-dissertations-3021
record_format oai_dc
spelling ndltd-UMASS-oai-scholarworks.umass.edu-dissertations-3021 2020-12-02T14:29:28Z 1998-01-01T08:00:00Z text Doctoral Dissertations Available from Proquest
collection NDLTD
language ENG
sources NDLTD
topic Mathematics|Electrical engineering|Artificial intelligence|Computer science
description We present an original algorithm for recognizing handwritten digits. We begin by introducing a virtually infinite collection of binary geometric features. The features are queries that ask whether a particular geometric arrangement of local topographic codes is present in an image. The codes, which we call "tags", are too coarse and common to be informative by themselves, but the presence of geometric arrangements of tags ("tag arrangements") can provide substantial information about the shape of an image. Tag arrangements are well suited to handwritten digit recognition because their presence in an image is unaffected by a large number of transformations that do not affect the class of the image. It is impossible to calculate all of the features in an image. We therefore use decision trees to simultaneously determine a small collection of informative features and construct a classifier. By considering only a small random sample of queries at each node, we are able to generate multiple randomized trees that determine a more varied and informative collection of features than is possible with a single tree. The trees, which provide posterior estimates of the class probabilities, are aggregated to produce a stable and robust classifier. We analyze the performance of this method and propose several means of augmenting it. Most notably, we introduce a nearest neighbor final test that reduces the already low error rate by an additional 20-30%. Testing was done on a subset of a National Institute of Standards and Technology database, and we report a classification rate of 99.6%, comparable to the top results reported elsewhere.
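The randomized-tree idea in the description (examine only a small random sample of queries at each node, keep class posteriors at the leaves, and aggregate the trees) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: plain binary feature vectors stand in for tag-arrangement queries, the split criterion (weighted entropy) and the aggregation rule (averaging posteriors) are assumptions, and every name below (`make_toy_data`, `grow`, `tree_posterior`, `classify`) is hypothetical.

```python
import math
import random
from collections import Counter


def make_toy_data(rng, n):
    """Hypothetical toy data (not NIST digits): 12 binary features, 2 classes."""
    X, y = [], []
    for _ in range(n):
        label = rng.randrange(2)
        on = range(0, 4) if label == 0 else range(4, 8)
        # Features in `on` are usually present; all others are usually absent.
        X.append([int(rng.random() < (0.9 if j in on else 0.1)) for j in range(12)])
        y.append(label)
    return X, y


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def posterior(labels, classes):
    """Empirical class distribution stored at a leaf."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}


def grow(X, y, classes, n_queries, rng, depth=0, max_depth=8):
    """Grow one randomized tree; leaves are dicts, internal nodes are tuples."""
    if depth == max_depth or len(set(y)) == 1:
        return posterior(y, classes)
    best = None
    # Only a small random sample of the available queries is examined here.
    for q in rng.sample(range(len(X[0])), n_queries):
        yes = [lab for row, lab in zip(X, y) if row[q]]
        no = [lab for row, lab in zip(X, y) if not row[q]]
        if not yes or not no:
            continue  # this query does not split the data at this node
        score = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(y)
        if best is None or score < best[0]:
            best = (score, q)
    if best is None:
        return posterior(y, classes)
    q = best[1]
    yes_rows = [(row, lab) for row, lab in zip(X, y) if row[q]]
    no_rows = [(row, lab) for row, lab in zip(X, y) if not row[q]]
    return (
        q,
        grow([r for r, _ in yes_rows], [l for _, l in yes_rows],
             classes, n_queries, rng, depth + 1, max_depth),
        grow([r for r, _ in no_rows], [l for _, l in no_rows],
             classes, n_queries, rng, depth + 1, max_depth),
    )


def tree_posterior(tree, x):
    """Follow the answers to the queries down to a leaf posterior."""
    while isinstance(tree, tuple):
        q, yes, no = tree
        tree = yes if x[q] else no
    return tree


def classify(forest, x, classes):
    """Average the leaf posteriors over all trees and return the top class."""
    avg = {c: sum(tree_posterior(t, x)[c] for t in forest) / len(forest)
           for c in classes}
    return max(avg, key=avg.get), avg
```

A forest could then be built with, for example, `rng = random.Random(0)`, `X, y = make_toy_data(rng, 200)` and `forest = [grow(X, y, [0, 1], 3, rng) for _ in range(15)]`. Because each node sees a different random sample of queries, the trees differ from one another even though they are grown on the same data.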
author Wilder, Kenneth Joseph
title Decision tree algorithms for handwritten digit recognition
publisher ScholarWorks@UMass Amherst
publishDate 1998
url https://scholarworks.umass.edu/dissertations/AAI9823791