Enriching lexical knowledge bases with encyclopedic relations

Lexical knowledge bases, such as WordNet, have been shown to be useful in a wide range of language processing applications. However, WordNet lacks certain information, such as topical relations between synsets. This thesis addresses this problem by enriching WordNet using information derived from Wik...

Bibliographic Details
Main Author: Fernando, Samuel
Other Authors: Stevenson, Mark
Published: University of Sheffield 2013
Subjects:
401
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.574079
id ndltd-bl.uk-oai-ethos.bl.uk-574079
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-574079; 2017-10-04T03:26:35Z; Enriching lexical knowledge bases with encyclopedic relations; Fernando, Samuel; Stevenson, Mark; 2013; 401; University of Sheffield; http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.574079; http://etheses.whiterose.ac.uk/4081/; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 401
description Lexical knowledge bases, such as WordNet, have been shown to be useful in a wide range of language processing applications. However, WordNet lacks certain information, such as topical relations between synsets. This thesis addresses this problem by enriching WordNet using information derived from Wikipedia. The approach maps concepts in WordNet to corresponding articles in Wikipedia in three stages. First, a set of candidate articles is retrieved for each WordNet concept, both by searching on the article title and by searching the full text using an IR engine. Second, text similarity scores are used to select the best match from the candidate articles. Finally, the mappings are refined using information from Wikipedia links to give a set of high-quality matches. The mappings are evaluated against a manually annotated gold-standard set of synset-article mappings. The annotation process indicates that the majority of synsets have a good matching article. The refined mappings are shown to have a precision of 88.2%. The mappings are then used to enrich relations in WordNet using Wikipedia links. The enriched WordNet is then used with a knowledge-based Word Sense Disambiguation system, evaluated on the SemCor 3.0 corpus. Adding the new relations improves performance significantly over the WordNet baseline, demonstrating the usefulness of the mappings on an extrinsic task.
author2 Stevenson, Mark
author_facet Stevenson, Mark
Fernando, Samuel
author Fernando, Samuel
author_sort Fernando, Samuel
title Enriching lexical knowledge bases with encyclopedic relations
publisher University of Sheffield
publishDate 2013
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.574079
work_keys_str_mv AT fernandosamuel enrichinglexicalknowledgebaseswithencyclopedicrelations
_version_ 1718544268617842688
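
The three-stage synset-to-article mapping outlined in the description field can be illustrated with a minimal sketch on toy data. This is not the thesis implementation. The record does not say which similarity measure, IR engine, or link-based rule was used, so the bag-of-words cosine similarity, the token lookup standing in for full-text search, and the "keep a mapping only if its article is linked to another mapped article" rule are all assumptions made for illustration. Every function name and data entry below is hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (assumed measure, not from the thesis)."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def candidates(synset, articles):
    """Stage 1: retrieve candidates by title match and by full-text match."""
    lemmas = {lemma.lower() for lemma in synset["lemmas"]}
    found = set()
    for title, article in articles.items():
        # Drop a trailing disambiguation suffix such as " (geography)".
        base = re.sub(r"\s*\(.*\)$", "", title).lower()
        if base in lemmas or lemmas & set(tokens(article["text"])):
            found.add(title)
    return found


def best_match(synset, cands, articles):
    """Stage 2: pick the candidate whose text is most similar to the gloss."""
    if not cands:
        return None
    return max(sorted(cands), key=lambda t: cosine(synset["gloss"], articles[t]["text"]))


def refine(mapping, articles):
    """Stage 3 (assumed rule): keep a mapping only if its article links to,
    or is linked from, another mapped article."""
    chosen = set(mapping.values())
    kept = {}
    for synset_id, title in mapping.items():
        others = chosen - {title}
        outgoing = articles[title]["links"] & others
        incoming = {o for o in others if title in articles[o]["links"]}
        if outgoing or incoming:
            kept[synset_id] = title
    return kept


def map_synsets(synsets, articles):
    """Run all three stages and return the refined synset -> article mapping."""
    mapping = {}
    for synset_id, synset in synsets.items():
        match = best_match(synset, candidates(synset, articles), articles)
        if match is not None:
            mapping[synset_id] = match
    return refine(mapping, articles)


# Hypothetical toy data; real glosses and article texts would be far longer.
ARTICLES = {
    "Bank": {
        "text": "A bank is a financial institution that accepts deposits and lends money.",
        "links": {"Money"},
    },
    "Bank (geography)": {
        "text": "A bank is the sloping land alongside a river.",
        "links": {"River"},
    },
    "Money": {
        "text": "Money is a medium of exchange used for payment.",
        "links": {"Bank"},
    },
    "Oak": {"text": "An oak is a tree that bears acorns.", "links": {"Tree"}},
}
SYNSETS = {
    "bank.n.01": {
        "lemmas": ["bank"],
        "gloss": "a financial institution that accepts deposits and channels "
        "the money into lending activities",
    },
    "money.n.01": {"lemmas": ["money"], "gloss": "the most common medium of exchange"},
    "oak.n.01": {"lemmas": ["oak"], "gloss": "a deciduous tree that bears acorns"},
}

if __name__ == "__main__":
    print(map_synsets(SYNSETS, ARTICLES))
```

On this toy data the financial sense of "bank" is matched to the "Bank" article rather than "Bank (geography)", and the "oak" mapping is discarded in the last stage because its article shares no links with any other mapped article. A stricter or looser link rule would trade precision against coverage.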