Writer identification in medieval and modern handwriting

Bibliographic Details
Main Author: Gilliam, Tara
Other Authors: Clark, John A. ; Wilson, Richard C.
Published: University of York 2011
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.556252
Description
Summary: Writer identification is the task of associating a handwriting sample with the identity of the correct writer. It can be used to confirm or refute the authenticity of a document, or to link together documents produced by the same writer. This problem has applications in several areas, including forensics and palaeography, the study of historical books and writings. Rigorous manual writer identification requires the exhaustive comparison of character details, and is very time-consuming, making computer automation of all or part of this process attractive. Most research into automated writer identification has originated in forensic science, although more recently applications to historical texts are increasing. With mass digitisation of texts on the rise in libraries and collections, organising this new data is a growing problem. However, different types of writing have different characteristics, and require different handling. This thesis focuses on how medieval English manuscripts from the 14th to 15th centuries compare to the contemporary handwriting datasets used for much of the research and feature development in this area. The work presented here is based on an in-depth application of the grapheme codebook approach to offline writer identification. It finds domain-specific considerations throughout the process, particularly in grapheme creation and comparison and in the influence of document sources on system accuracy. Additionally, over the course of the data analysis, methods are proposed for the visualisation of extracted features, for quantifying the impact of sample source on identification accuracy, and for a nearest-neighbour-based verification system.