Vector representations of structured data

Bibliographic Details
Main Author: Mintram, Robert C.
Published: Southampton Solent University 2002
Subjects:
006
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.250160
Description
Summary: The connectionist approach to creating vector representations (VREPs) of structured data is usually implemented by artificial neural network (ANN) architectures. ANNs are trained on a representative corpus and can then demonstrate some degree of generalization to novel data. In this context, structured data are typically trees, the leaf nodes of which are assigned some n-element (often binary) vector representation. The strategy used to encode the leaf data and the width of the consequent vectors can have an impact on the encoding performance of the ANN architecture. In this thesis the architecture of principal interest is called simplified recursive auto-associative memory, (S)RAAM, which was devised to provide a theoretical model for another architecture called recursive auto-associative memory, RAAM. Research on RAAMs continues in terms of improving their learning ability, understanding the features that are encoded and improving generalization. (S)RAAM is a mathematical model that lends itself more readily to addressing these issues. Usually ANNs designed to encode structured data will, as a result of training, simultaneously create an encoder function to transform the data into vectors and a decoder function to perform the reverse transformation. (S)RAAM, as a model of this process, was designed to follow this paradigm. It is shown that this is not strictly necessary and that encoder and decoder functions can be created at separate times, their connection being maintained by the data upon which they operate. This leads to a new, more versatile model called, in this thesis, the General Encoder Decoder, GED.
The GED, like (S)RAAM, is implemented as an algorithm rather than a neural network architecture. The thesis contends that the broad scope of the GED model makes it a versatile experimental vehicle supporting research into key properties of VREPs. In particular, these properties include the strategy used to encode the leaf tokens within tree structures and the features of these structures that are preferentially encoded.
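As a rough illustration of the encoder/decoder paradigm the abstract describes (not the thesis's actual (S)RAAM or GED derivation), a binary tree of n-element leaf vectors can be compressed bottom-up by an encoder that maps a pair of child vectors to a single parent vector of the same width, with a decoder approximately inverting that map. The linear maps below are hypothetical stand-ins; a real (S)RAAM would derive them from a training corpus rather than use a random matrix and its pseudoinverse.

```python
import numpy as np

N = 4  # width of every node representation (illustrative choice)

rng = np.random.default_rng(0)
# Hypothetical encoder/decoder pair. In the thesis's models these would be
# learned from data; here a random map and its pseudoinverse merely
# demonstrate the shapes involved.
W_enc = rng.standard_normal((N, 2 * N))   # maps [left; right] -> parent
W_dec = np.linalg.pinv(W_enc)             # approximate inverse: parent -> [left; right]

def encode(tree):
    """Recursively compress a binary tree into one N-element vector."""
    if isinstance(tree, np.ndarray):      # leaf: already an N-vector
        return tree
    left, right = tree
    return W_enc @ np.concatenate([encode(left), encode(right)])

def decode_step(vec):
    """Recover approximate child vectors from a parent vector."""
    children = W_dec @ vec                # lossy: 2N values were compressed to N
    return children[:N], children[N:]

leaf_a, leaf_b, leaf_c = (rng.standard_normal(N) for _ in range(3))
root = encode((leaf_a, (leaf_b, leaf_c)))   # encodes the tree (a, (b, c))
left, right = decode_step(root)
```

Because the encoder and decoder here are defined independently and are connected only through the vectors they exchange, the sketch also hints at the GED observation that the two functions need not be created simultaneously by a single training run.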