Inferring unobserved co-occurrence events in Anchored Packed Trees

Anchored Packed Trees (APTs) are a novel approach to distributional semantics that takes distributional composition to be a process of lexeme contextualisation. A lexeme's meaning, characterised as knowledge concerning co-occurrences involving that lexeme, is represented with a higher-order dep...


Bibliographic Details
Main Author:	Kober, Thomas Helmut
Published:	University of Sussex 2018
Subjects:	004 Q0387.5 Semantic networks
Online Access:	https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.742157

id	ndltd-bl.uk-oai-ethos.bl.uk-742157
record_format	oai_dc
spelling	ndltd-bl.uk-oai-ethos.bl.uk-742157; 2019-03-05T15:19:02Z; Inferring unobserved co-occurrence events in Anchored Packed Trees; Kober, Thomas Helmut; 2018; 004; Q0387.5 Semantic networks; University of Sussex; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.742157; http://sro.sussex.ac.uk/id/eprint/75718/; Electronic Thesis or Dissertation
collection	NDLTD
sources	NDLTD
topic	004 Q0387.5 Semantic networks
description	Anchored Packed Trees (APTs) are a novel approach to distributional semantics that takes distributional composition to be a process of lexeme contextualisation. A lexeme's meaning, characterised as knowledge concerning co-occurrences involving that lexeme, is represented with a higher-order dependency-typed structure (the APT) where paths associated with higher-order dependencies connect vertices associated with weighted lexeme multisets. The central innovation in the compositional theory is that the APT's type structure enables the precise alignment of the semantic representation of each of the lexemes being composed. Like other count-based distributional spaces, however, Anchored Packed Trees are prone to considerable data sparsity, caused by not observing all plausible co-occurrences in the given data. This problem is amplified for models like APTs that take the grammatical type of a co-occurrence into account. This results in a very sparse distributional space, requiring a mechanism for inferring missing knowledge. Most methods face this challenge in ways that render the resulting word representations uninterpretable, with the consequence that distributional composition becomes difficult to model and reason about. In this thesis, I will present a practical evaluation of the APT theory, including a large-scale hyperparameter sensitivity study and a characterisation of the distributional space that APTs give rise to. Based on the empirical analysis, the impact of the problem of data sparsity is investigated. In order to address the data sparsity challenge and retain the interpretability of the model, I explore an alternative algorithm, distributional inference, for improving elementary representations. The algorithm involves explicitly inferring unobserved co-occurrence events by leveraging the distributional neighbourhood of the semantic space. I then leverage the rich type structure in APTs and propose a generalisation of the distributional inference algorithm. I empirically show that distributional inference improves elementary word representations and is especially beneficial when combined with an intersective composition function, which is due to the complementary nature of inference and composition. Lastly, I qualitatively analyse the proposed algorithms in order to characterise the knowledge that they are able to infer, as well as their impact on the distributional APT space.
author	Kober, Thomas Helmut
title	Inferring unobserved co-occurrence events in Anchored Packed Trees
publisher	University of Sussex
publishDate	2018
url	https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.742157
