Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate
Master's thesis === National Chi Nan University === Department of Computer Science and Information Engineering === 102 === In traditional phrase-based statistical machine translation (PB SMT), word alignment is performed before the phrase translation tables are generated. Heuristics are then used to find the possible phrase pairs, thus producing the Phrase Translation Table...
Main Authors: | Peng, Wei-Gang, 彭維剛 |
---|---|
Other Authors: | Chang, Jing-Shin |
Format: | Others |
Language: | zh-TW |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/72467599609554966608 |
id |
ndltd-TW-101NCNU0392049 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101NCNU0392049 2016-03-16T04:14:50Z http://ndltd.ncl.edu.tw/handle/72467599609554966608 Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate 以詞組對涵蓋率縮減詞組為本之統計式機器翻譯雙語對照表 Peng, Wei-Gang 彭維剛 Master's thesis National Chi Nan University Department of Computer Science and Information Engineering 102 In traditional phrase-based statistical machine translation (PB SMT), word alignment is performed before the phrase translation tables are generated. Heuristics are then used to find the possible phrase pairs, producing the Phrase Translation Table. Because the phrases and phrase pairs are induced from word alignment without any phrase segmentation criteria, a large number of messy phrases can be produced. In this paper, we propose a “Phrase Pair Coverage Rate” measure to help reduce the Phrase Translation Table. We first use an EM algorithm to find the best phrase segmentation of the source and target sentences. Later, in the Translation Model training (TM training) step of Moses, this phrase segmentation information is used, based on the “Phrase Pair Coverage Rate”, to reduce the size of the Phrase Translation Table. The reduced Phrase Translation Table is then used with the Language Model (LM) to decode (i.e., translate) the source sentences. Finally, the BLEU score is computed to evaluate translation quality, and the result is compared against the Moses SMT system. The experimental results show that the phrase table size can be reduced by about 60 to 70%, while the BLEU score remains close to that of Moses. Chang, Jing-Shin 張景新 2014 學位論文 ; thesis 33 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Chi Nan University === Department of Computer Science and Information Engineering === 102 === In traditional phrase-based statistical machine translation (PB SMT), word alignment is performed before the phrase translation tables are generated. Heuristics are then used to find the possible phrase pairs, producing the Phrase Translation Table. Because the phrases and phrase pairs are induced from word alignment without any phrase segmentation criteria, a large number of messy phrases can be produced.
In this paper, we propose a “Phrase Pair Coverage Rate” measure to help reduce the Phrase Translation Table. We first use an EM algorithm to find the best phrase segmentation of the source and target sentences. Later, in the Translation Model training (TM training) step of Moses, this phrase segmentation information is used, based on the “Phrase Pair Coverage Rate”, to reduce the size of the Phrase Translation Table. The reduced Phrase Translation Table is then used with the Language Model (LM) to decode (i.e., translate) the source sentences. Finally, the BLEU score is computed to evaluate translation quality, and the result is compared against the Moses SMT system.
The experimental results show that the phrase table size can be reduced by about 60 to 70%, while the BLEU score remains close to that of Moses.
|
author2 |
Chang, Jing-Shin |
author_facet |
Chang, Jing-Shin Peng, Wei-Gang 彭維剛 |
author |
Peng, Wei-Gang 彭維剛 |
spellingShingle |
Peng, Wei-Gang 彭維剛 Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate |
author_sort |
Peng, Wei-Gang |
title |
Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate |
title_short |
Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate |
title_full |
Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate |
title_fullStr |
Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate |
title_full_unstemmed |
Reducing Phrase-based SMT Translation Tables by Phrase Pair Coverage Rate |
title_sort |
reducing phrase-based smt translation tables by phrase pair coverage rate |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/72467599609554966608 |
work_keys_str_mv |
AT pengweigang reducingphrasebasedsmttranslationtablesbyphrasepaircoveragerate AT péngwéigāng reducingphrasebasedsmttranslationtablesbyphrasepaircoveragerate AT pengweigang yǐcízǔduìhángàilǜsuōjiǎncízǔwèiběnzhītǒngjìshìjīqìfānyìshuāngyǔduìzhàobiǎo AT péngwéigāng yǐcízǔduìhángàilǜsuōjiǎncízǔwèiběnzhītǒngjìshìjīqìfānyìshuāngyǔduìzhàobiǎo |
_version_ |
1718205919654838272 |
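
The pruning step described in the abstract can be illustrated with a minimal sketch. The record does not give the thesis's definition of "Phrase Pair Coverage Rate", so the definition below is an assumption made only for illustration: the fraction of a phrase pair's heuristic extractions that also occur in the best phrase segmentation. The function names, the threshold, and the toy data are hypothetical; the only detail taken from Moses is the ` ||| ` field separator of its text phrase tables.

```python
def coverage_rates(extracted_counts, segmented_counts):
    """Assumed coverage rate: segmentation occurrences / extraction occurrences."""
    rates = {}
    for pair, n_extracted in extracted_counts.items():
        n_segmented = segmented_counts.get(pair, 0)
        rates[pair] = n_segmented / n_extracted if n_extracted else 0.0
    return rates


def prune_phrase_table(lines, rates, threshold):
    """Keep phrase-table lines whose (source, target) pair meets the threshold."""
    kept = []
    for line in lines:
        # Moses text phrase tables separate fields with "|||".
        fields = [field.strip() for field in line.split("|||")]
        pair = (fields[0], fields[1])
        if rates.get(pair, 0.0) >= threshold:
            kept.append(line)
    return kept


# Toy data (hypothetical): how often each pair was extracted by the
# alignment heuristics, and how often it appears in the best segmentation.
extracted = {
    ("das haus", "the house"): 4,
    ("haus ,", "house ,"): 4,
    ("das", "the"): 5,
}
segmented = {
    ("das haus", "the house"): 3,
    ("das", "the"): 5,
}
table = [
    "das haus ||| the house ||| 0.8 0.7 0.8 0.6",
    "haus , ||| house , ||| 0.5 0.4 0.5 0.3",
    "das ||| the ||| 0.9 0.9 0.9 0.9",
]

rates = coverage_rates(extracted, segmented)
pruned = prune_phrase_table(table, rates, threshold=0.5)
reduction = 1 - len(pruned) / len(table)
print(f"kept {len(pruned)} of {len(table)} entries, reduction {reduction:.0%}")
```

In this toy example the pair that never appears in the best segmentation is dropped, which mirrors the idea of discarding "messy" phrase pairs; the actual measure and threshold used in the thesis may differ.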