Contextual pattern recognition with applications to biomedical image identification

Bibliographic Details
Main Author: Song, Xubo
Format: Others
Published: 1999
Online Access: https://thesis.library.caltech.edu/3690/1/Song_x_1999.pdf
Song, Xubo (1999) Contextual pattern recognition with applications to biomedical image identification. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/F5YK-HM52. https://resolver.caltech.edu/CaltechETD:etd-09222005-111015
Description
Summary: This thesis studies two rather distinct topics: the incorporation of contextual information in pattern recognition, with applications to biomedical image identification; and the theoretical modeling of learning and generalization in machine learning. In Part I of the thesis, we propose techniques to incorporate contextual information into object classification. In the real world there are cases where the identity of an object is ambiguous because of noise in the measurements on which the classification is based. It is helpful to reduce the ambiguity by utilizing extra information referred to as context, which in our case is the identities of the accompanying objects. We investigate the incorporation of both full and partial context. Their error probabilities, in terms of both set-by-set error and element-by-element error, are established and compared with those of the context-free approach. The computational cost is studied in detail for the full-context, partial-context, and context-free cases. The techniques are applied to toy problems as well as real-world problems such as white blood cell image classification and microscopic urinalysis. It is demonstrated that superior classification performance is achieved by using context. In our particular application, it reduces the overall classification error, as well as the false positive and false negative diagnosis rates. In Part II of the thesis, we propose a novel theoretical framework, called the Bin Model, for learning and generalization. Using the Bin Model, a closed form is derived for generalization that estimates the out-of-sample performance in terms of the in-sample performance. We address the problem of overfitting and characterize conditions under which it does not appear. The effect of noise on generalization is studied, and the extension of the Bin Model framework from classification problems to regression problems is discussed.