Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer


Bibliographic Details
Main Authors: Bai, Ming-Hong, 白明弘
Other Authors: Chang, Jason S.
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/63001866540723370915
id ndltd-TW-101NTHU5392118
record_format oai_dc
spelling ndltd-TW-101NTHU5392118 2015-10-13T22:29:58Z http://ndltd.ncl.edu.tw/handle/63001866540723370915 Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer Bai, Ming-Hong 白明弘 Ph.D. National Tsing Hua University Department of Computer Science 101 Chang, Jason S. Chen, Keh-Jiann 張俊盛 陳克健 2013 thesis 97 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Ph.D. === National Tsing Hua University === Department of Computer Science === 101 === A bilingual concordancer is a computer-assisted translation tool that uses a parallel corpus as its knowledge base. Given a word or phrase, the bilingual concordancer retrieves from the parallel corpus the aligned sentence pairs whose source sentences contain that word or phrase. It then identifies the translation equivalents in the target sentences and reorders the sentence pairs according to the correlation between the query string and the translation equivalents. It helps not only in finding translation equivalents of the query but also in presenting the various contexts in which they occur. As a result, it is extremely useful for bilingual lexicographers, human translators, and second-language learners. Extraction of bilingual multi-word expressions is the most important part of a bilingual concordancer. For example, highlighting translation equivalents in the target sentence and generating a list of translation equivalents both depend heavily on a high-quality extraction model. However, existing models for extracting translation equivalents still have many problems and leave room for improvement. In this thesis, we discuss some problems of existing models for extracting bilingual multi-word expressions, including the over-alignment problem and the under-alignment problem. We then propose a novel model that addresses these problems to improve the quality of the extracted translation equivalents. Further, we implement a bilingual concordancer that employs the proposed translation extraction model. To measure the performance of the bilingual concordancer, we use three types of multi-word expressions as test targets. The results are compared with those of existing statistical machine translation models.
author2 Chang, Jason S.
author_facet Chang, Jason S.
Bai, Ming-Hong
白明弘
author Bai, Ming-Hong
白明弘
spellingShingle Bai, Ming-Hong
白明弘
Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer
author_sort Bai, Ming-Hong
title Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer
title_short Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer
title_full Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer
title_fullStr Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer
title_full_unstemmed Extraction of Bilingual Multiword Expressions with Application to Bilingual Concordancer
title_sort extraction of bilingual multiword expressions with application to bilingual concordancer
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/63001866540723370915
work_keys_str_mv AT baiminghong extractionofbilingualmultiwordexpressionswithapplicationtobilingualconcordancer
AT báimínghóng extractionofbilingualmultiwordexpressionswithapplicationtobilingualconcordancer
_version_ 1718077412900601856
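
The description field above outlines three steps: retrieve the sentence pairs whose source side contains the query, identify a translation equivalent in each target sentence, and reorder the pairs by how strongly the query and the equivalent are correlated. The following is a minimal sketch of that loop, not the thesis's extraction model. It assumes single-word queries, a toy English-French corpus invented for illustration with function words already filtered out, and the Dice coefficient as a stand-in correlation measure. The names `CORPUS`, `dice_scores`, and `concordance` are hypothetical.

```python
from collections import Counter

# Toy parallel corpus (invented for illustration): (source tokens, target tokens).
# Assumption: function words have already been filtered out of both sides.
CORPUS = [
    (["bank", "closed", "today"], ["banque", "fermée", "aujourd'hui"]),
    (["bank", "opens"], ["banque", "ouvre"]),
    (["river", "bank", "wide"], ["rive", "large"]),
    (["house", "closed"], ["maison", "fermée"]),
    (["river", "wide"], ["rivière", "large"]),
]


def dice_scores(corpus, query):
    """Dice coefficient between `query` and each co-occurring target word.

    Counts are per sentence pair: 2 * joint / (source_count + target_count).
    """
    src_count = 0
    tgt_counts = Counter()
    joint_counts = Counter()
    for src, tgt in corpus:
        tgt_set = set(tgt)
        tgt_counts.update(tgt_set)
        if query in src:
            src_count += 1
            joint_counts.update(tgt_set)
    return {
        word: 2 * joint / (src_count + tgt_counts[word])
        for word, joint in joint_counts.items()
    }


def concordance(corpus, query):
    """Return (score, equivalent, source, target) tuples, best score first."""
    scores = dice_scores(corpus, query)
    hits = []
    for src, tgt in corpus:
        if query in src:
            # Pick the target word most strongly associated with the query.
            best = max(tgt, key=lambda word: scores[word])
            hits.append((scores[best], best, src, tgt))
    # Stable sort: pairs with equal scores keep their corpus order.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


if __name__ == "__main__":
    for score, equivalent, src, tgt in concordance(CORPUS, "bank"):
        print(f"{score:.2f}  {equivalent:<8} {' '.join(src)} | {' '.join(tgt)}")
```

On this toy corpus the two "banque" pairs rank ahead of the "rive" pair. A plain co-occurrence score like this one tends to pick frequent function words when they are left in, and it cannot handle multi-word expressions, which is the kind of weakness in existing models that the abstract refers to.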