Born Broken: Fonts and Information Loss in Legacy Digital Documents

For millions of legacy documents, correct rendering depends upon resources such as fonts that are not generally embedded within the document structure. Yet there is a significant risk of information loss due to missing or incorrectly substituted fonts. Large document collections depend on thousands...

Bibliographic Details
Main Authors: Geoffrey Brown, Kam Woods
Format: Article
Language: English
Published: University of Edinburgh 2011-03-01
Series: International Journal of Digital Curation
Online Access: http://www.ijdc.net/index.php/ijdc/article/view/159
id doaj-4657a72b954f41e1a84cb742a1414d37
record_format Article
spelling doaj-4657a72b954f41e1a84cb742a1414d37 | 2020-11-24T21:13:24Z | eng | University of Edinburgh | International Journal of Digital Curation | 1746-8256 | 2011-03-01 | 6151910.2218/ijdc.v6i1.168151 | Born Broken: Fonts and Information Loss in Legacy Digital Documents | Geoffrey Brown | Kam Woods | For millions of legacy documents, correct rendering depends upon resources such as fonts that are not generally embedded within the document structure. Yet there is a significant risk of information loss due to missing or incorrectly substituted fonts. Large document collections depend on thousands of unique fonts not available on a common desktop workstation, which typically has between 100 and 200 fonts. Silent substitution of fonts, performed by applications such as Microsoft Office, can yield poorly rendered documents. In this paper we use a collection of 230,000 Word documents to assess the difficulty of matching font requirements with a database of fonts. We describe the identifying information contained in common font formats, font requirements stored in Word documents, the API provided by Windows to support font requests by applications, the documented substitution algorithms used by Windows when requested fonts are not available, and the ways in which support software might be used to control font substitution in a preservation environment. | http://www.ijdc.net/index.php/ijdc/article/view/159
collection DOAJ
language English
format Article
sources DOAJ
author Geoffrey Brown
Kam Woods
publisher University of Edinburgh
series International Journal of Digital Curation
issn 1746-8256
publishDate 2011-03-01
description For millions of legacy documents, correct rendering depends upon resources such as fonts that are not generally embedded within the document structure. Yet there is a significant risk of information loss due to missing or incorrectly substituted fonts. Large document collections depend on thousands of unique fonts not available on a common desktop workstation, which typically has between 100 and 200 fonts. Silent substitution of fonts, performed by applications such as Microsoft Office, can yield poorly rendered documents. In this paper we use a collection of 230,000 Word documents to assess the difficulty of matching font requirements with a database of fonts. We describe the identifying information contained in common font formats, font requirements stored in Word documents, the API provided by Windows to support font requests by applications, the documented substitution algorithms used by Windows when requested fonts are not available, and the ways in which support software might be used to control font substitution in a preservation environment.
url http://www.ijdc.net/index.php/ijdc/article/view/159