Computational Terminology : Exploring Bilingual and Monolingual Term Extraction
Terminologies are becoming more important to modern day society as technology and science continue to grow at an accelerating rate in a globalized environment. Agreeing upon which terms should be used to represent which concepts and how those terms should be translated into different languages is im...
Main Author: | Foo, Jody |
---|---|
Format: | Others |
Language: | English |
Published: | Linköpings universitet, NLPLAB - Laboratoriet för databehandling av naturligt språk, 2012 |
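Subjects: | terminology; automatic term extraction; automatic term recognition; computational terminology; terminology management; Language Technology (Computational Linguistics); Språkteknologi (språkvetenskaplig databehandling) |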
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-75243 http://nbn-resolving.de/urn:isbn:9789175199443 |
id |
ndltd-UPSALLA1-oai-DiVA.org-liu-75243 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-liu-75243; 2020-08-28T05:37:25Z; Computational Terminology : Exploring Bilingual and Monolingual Term Extraction; eng; Foo, Jody; Linköpings universitet, NLPLAB - Laboratoriet för databehandling av naturligt språk; Linköpings universitet, Tekniska högskolan; Linköping; 2012; terminology; automatic term extraction; automatic term recognition; computational terminology; terminology management; Language Technology (Computational Linguistics); Språkteknologi (språkvetenskaplig databehandling); Terminologies are becoming more important to modern-day society as technology and science continue to grow at an accelerating rate in a globalized environment. Agreeing upon which terms should be used to represent which concepts, and how those terms should be translated into different languages, is important if we wish to communicate with as little confusion and as few misunderstandings as possible. Since the 1990s, an increasing amount of terminology research has been devoted to facilitating and augmenting terminology-related tasks by using computers and computational methods. One focus of this research is Automatic Term Extraction (ATE). In this compilation thesis, studies on both bilingual and monolingual ATE are presented. First, two publications report on how bilingual ATE using the align-extract approach can be used to extract patent terms. The result in this case was 181,000 manually validated English-Swedish patent terms, which were to be used in a machine translation system for patent documents. A critical component of the method used is the Q-value metric, presented in the third paper, which can be used to rank extracted term candidates (TC) in an order that correlates with TC precision. The use of Machine Learning (ML) in monolingual ATE is the topic of the two final contributions. The first ML-related publication shows that rule-induction-based ML can be used to generate linguistic term selection patterns, and in the second ML-related publication, contrastive n-gram language models are used in conjunction with SVM ML to improve the precision of term candidates selected using linguistic patterns.; Licentiate thesis, comprehensive summary; info:eu-repo/semantics/masterThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-75243; urn:isbn:9789175199443; Local LiU-TEK-LIC-201285; Linköping Studies in Science and Technology. Thesis, 0280-7971 ; 1523; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
terminology automatic term extraction automatic term recognition computational terminology terminology management Language Technology (Computational Linguistics) Språkteknologi (språkvetenskaplig databehandling) |
description |
Terminologies are becoming more important to modern-day society as technology and science continue to grow at an accelerating rate in a globalized environment. Agreeing upon which terms should be used to represent which concepts, and how those terms should be translated into different languages, is important if we wish to communicate with as little confusion and as few misunderstandings as possible. Since the 1990s, an increasing amount of terminology research has been devoted to facilitating and augmenting terminology-related tasks by using computers and computational methods. One focus of this research is Automatic Term Extraction (ATE). In this compilation thesis, studies on both bilingual and monolingual ATE are presented. First, two publications report on how bilingual ATE using the align-extract approach can be used to extract patent terms. The result in this case was 181,000 manually validated English-Swedish patent terms, which were to be used in a machine translation system for patent documents. A critical component of the method used is the Q-value metric, presented in the third paper, which can be used to rank extracted term candidates (TC) in an order that correlates with TC precision. The use of Machine Learning (ML) in monolingual ATE is the topic of the two final contributions. The first ML-related publication shows that rule-induction-based ML can be used to generate linguistic term selection patterns, and in the second ML-related publication, contrastive n-gram language models are used in conjunction with SVM ML to improve the precision of term candidates selected using linguistic patterns. |
author |
Foo, Jody |
title |
Computational Terminology : Exploring Bilingual and Monolingual Term Extraction |
title_sort |
computational terminology : exploring bilingual and monolingual term extraction |
publisher |
Linköpings universitet, NLPLAB - Laboratoriet för databehandling av naturligt språk |
publishDate |
2012 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-75243 http://nbn-resolving.de/urn:isbn:9789175199443 |