Statistical Japanese-English Machine Translation System Using Term Extraction

Bibliographic Details
Main Authors: Wen Chen Cheng, 溫振丞
Other Authors: 邊國維
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/79459738678197964758
Description
Summary: Master's thesis === Huafan University === Master's Program, Department of Information Management === 97 === In this paper, we propose using a term extraction tool to extract multi-word patterns before the word alignment step of a statistical machine translation system. Each identified pattern is then treated as a single word during alignment and translation. We designed an English-Japanese machine translation system that combines this term extraction technique with word alignment, part-of-speech tagging, and translation probabilities, and we evaluated its performance under different translation models. The bilingual corpus of the NTCIR-7 Patent Translation Task was used for our experiments. For training, 100,000 aligned sentences were selected from the parallel corpus. Common patterns of two to six words were extracted and processed as single words. Another 1,380 sentences were selected for testing and evaluation. The BLEU and NIST evaluations show that the n-gram precisions obtained with term extraction are higher than those obtained without it.
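
The preprocessing idea in the summary, treating a multi-word pattern as one token before word alignment, can be illustrated with a minimal Python sketch. The record does not name the term extraction tool or its selection criteria, so everything here is an assumption for illustration only: the function names `extract_patterns` and `merge_patterns`, the simple frequency threshold `min_count`, the greedy longest-match merge, the underscore joiner, and the toy two-sentence corpus. Only the pattern length range of two to six words comes from the summary.

```python
from collections import Counter


def extract_patterns(sentences, min_len=2, max_len=6, min_count=2):
    """Count every n-gram of min_len..max_len tokens and keep the frequent ones.

    Assumption: a plain frequency threshold stands in for the (unnamed)
    term extraction tool used in the thesis.
    """
    counts = Counter()
    for tokens in sentences:
        for n in range(min_len, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {pattern for pattern, c in counts.items() if c >= min_count}


def merge_patterns(tokens, patterns, max_len=6, joiner="_"):
    """Greedy left-to-right, longest-match-first merge of known patterns.

    Each matched pattern is emitted as one joined token, so a downstream
    word aligner would see it as a single word.
    """
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to length 2.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in patterns:
                out.append(joiner.join(candidate))
                i += n
                break
        else:
            # No pattern starts here: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical toy corpus (not taken from the NTCIR-7 data).
corpus = [
    "the semiconductor device includes a gate electrode".split(),
    "a gate electrode is formed on the semiconductor device".split(),
]
patterns = extract_patterns(corpus)
merged = [merge_patterns(sentence, patterns) for sentence in corpus]
for sentence in merged:
    print(" ".join(sentence))
```

In this sketch the merged sentences would be passed to the word aligner in place of the original token sequences, and the joiner would be split back out after decoding. A real term extractor would likely use stronger criteria than raw frequency, for example part-of-speech filters or association scores.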