Ambiguity and Incomplete Information in Categorical Models of Language

We investigate notions of ambiguity and partial information in categorical distributional models of natural language. Probabilistic ambiguity has previously been studied using Selinger's CPM construction. This construction works well for models built upon vector spaces, as has been shown in qua...

Bibliographic Details
Main Author:	Dan Marsden
Format:	Article
Language:	English
Published:	Open Publishing Association 2017-01-01
Series:	Electronic Proceedings in Theoretical Computer Science
Online Access:	http://arxiv.org/pdf/1701.00660v1

id	doaj-aa891153c07547878b954713050c2642
record_format	Article
spelling	doaj-aa891153c07547878b954713050c2642; 2020-11-25T01:05:54Z; eng; Open Publishing Association; Electronic Proceedings in Theoretical Computer Science; 2075-2180; 2017-01-01; 236; Proc. QPL 2016; 95; 107; 10.4204/EPTCS.236.7:17; Ambiguity and Incomplete Information in Categorical Models of Language; Dan Marsden; 0 University of Oxford; http://arxiv.org/pdf/1701.00660v1
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Dan Marsden
spellingShingle	Dan Marsden Ambiguity and Incomplete Information in Categorical Models of Language Electronic Proceedings in Theoretical Computer Science
title	Ambiguity and Incomplete Information in Categorical Models of Language
title_sort	ambiguity and incomplete information in categorical models of language
publisher	Open Publishing Association
series	Electronic Proceedings in Theoretical Computer Science
issn	2075-2180
publishDate	2017-01-01
description	We investigate notions of ambiguity and partial information in categorical distributional models of natural language. Probabilistic ambiguity has previously been studied using Selinger's CPM construction. This construction works well for models built upon vector spaces, as has been shown in quantum computational applications. Unfortunately, it doesn't seem to provide a satisfactory method for introducing mixing in other compact closed categories such as the category of sets and binary relations. We therefore lack a uniform strategy for extending a category to model imprecise linguistic information. In this work we adopt a different approach. We analyze different forms of ambiguous and incomplete information, both with and without quantitative probabilistic data. Each scheme then corresponds to a suitable enrichment of the category in which we model language. We view different monads as encapsulating the informational behaviour of interest, by analogy with their use in modelling side effects in computation. Previous results of Jacobs then allow us to systematically construct suitable bases for enrichment. We show that we can freely enrich arbitrary dagger compact closed categories in order to capture all the phenomena of interest, whilst retaining the important dagger compact closed structure. This allows us to construct a model with real convex combination of binary relations that makes non-trivial use of the scalars. Finally we relate our various different enrichments, showing that finite subconvex algebra enrichment covers all the effects under consideration.
url	http://arxiv.org/pdf/1701.00660v1
work_keys_str_mv	AT danmarsden ambiguityandincompleteinformationincategoricalmodelsoflanguage
_version_	1725192643519971328

Similar Items