Feature analysis of diagrams with applications to retrieval and classification

Bibliographic Details
Online Access: http://hdl.handle.net/2047/d20002794
id ndltd-NEU--neu-900
record_format oai_dc
spelling ndltd-NEU--neu-900 2021-05-26T05:10:54Z
collection NDLTD
sources NDLTD
description Millions of diagrams are available online in scientific and technical documents. The knowledge they contain is a rich resource, but one that has been little exploited in comparison to text. In this thesis we demonstrate that it is possible to efficiently build compact representations of diagram content. These representations can support a wide variety of diagram-based systems, such as retrieval, classification, and the building of knowledge bases that integrate text and diagrams. We demonstrate the strengths of our approach through studies of diagram retrieval as well as supervised and unsupervised machine learning for classification. The techniques are applied to a 700-diagram subset of more than 10,000 diagrams harvested from articles published by the Open Access publisher BioMed Central. A substantial set of Java-based tools was developed exclusively for this research, which will allow others to build on and extend what we have done.
title Feature analysis of diagrams with applications to retrieval and classification
url http://hdl.handle.net/2047/d20002794
_version_ 1719406462401249280