Statistical-based system combination approach to gain advantages over different machine translation systems

Every machine translation system has some advantages. We propose an improved statistical system combination approach to achieve the advantages of existing machine translation systems. The primary task is to score all the phrases of the outputs of different machine translation systems selected for co...

Bibliographic Details
Main Authors: Debajyoty Banik, Asif Ekbal, Pushpak Bhattacharyya, Siddhartha Bhattacharyya, Jan Platos
Format: Article
Language: English
Published: Elsevier 2019-09-01
Series: Heliyon
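Subjects: System combination method; Machine translation; Statistical approach; Neural machine translation (NMT); Neural network; Hierarchical machine translation (Hiero) systems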
Online Access: http://www.sciencedirect.com/science/article/pii/S240584401936164X
id doaj-e998fea4e9fe41da8f7f534d2d97c48e
record_format Article
spelling doaj-e998fea4e9fe41da8f7f534d2d97c48e
2020-11-25T02:04:56Z
eng
Elsevier
Heliyon, ISSN 2405-8440, 2019-09-01, volume 5, issue 9, article e02504
Statistical-based system combination approach to gain advantages over different machine translation systems
Debajyoty Banik (Department of Computer Science and Engineering, India; Indian Institute of Technology Patna, India)
Asif Ekbal (Department of Computer Science and Engineering, India; Indian Institute of Technology Patna, India)
Pushpak Bhattacharyya (Department of Computer Science and Engineering, India; Indian Institute of Technology Patna, India)
Siddhartha Bhattacharyya (Faculty of Electrical Engineering and Computer Science, VSB Technical University of Ostrava, Czech Republic; RCC Institute of Information Technology, Kolkata, India; corresponding author at: Faculty of Electrical Engineering and Computer Science, VSB Technical University of Ostrava, Czech Republic)
Jan Platos (Faculty of Electrical Engineering and Computer Science, VSB Technical University of Ostrava, Czech Republic)
collection DOAJ
language English
format Article
sources DOAJ
author Debajyoty Banik
Asif Ekbal
Pushpak Bhattacharyya
Siddhartha Bhattacharyya
Jan Platos
author_sort Debajyoty Banik
title Statistical-based system combination approach to gain advantages over different machine translation systems
publisher Elsevier
series Heliyon
issn 2405-8440
publishDate 2019-09-01
description Every machine translation system has some advantages. We propose an improved statistical system combination approach to achieve the advantages of existing machine translation systems. The primary task is to score all the phrases of the outputs of different machine translation systems selected for combination. Three steps are involved in the proposed statistical system combination approach, viz., alignment, decoding, and scoring. Pair alignment is done in the first step to prevent duplication, so that only a single phrase is chosen from various phrases containing the same information. Thus the alignment and scoring strategy are implemented in our approach. Hypotheses are built in the second step. In the third step, we calculate the scores for all the hypotheses. The hypothesis with the highest score is chosen as the final translated output. Incorrect scoring can cause the best parts of the different systems' outputs to be misidentified. Note also that a particular phrase may appear in various forms in different translations. To address these challenges, we incorporate WordNet in the alignment phase and word2vec in the scoring phase along with the existing factors. We find that the system combination model using WordNet and word2vec injection improves the machine translation accuracy. In this work, we have merged three systems, viz., a hierarchical machine translation system, Bing Microsoft Translate, and Google Translate. Extensive translation experiments on eight language pairs with benchmark datasets demonstrate that the proposed system achieves better quality than the individual systems and the state-of-the-art system combination models.
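The three steps named in the description (alignment, decoding, scoring) can be illustrated with a small Python sketch. This is not the authors' implementation: `SYNONYM_SETS` is a hypothetical toy stand-in for WordNet, `VECTORS` is a hypothetical toy stand-in for word2vec embeddings, and the sketch assumes word-level outputs of equal length that line up position by position, which real system outputs generally do not.

```python
from itertools import product
from math import sqrt

# Hypothetical stand-in for WordNet synonym sets (toy data, not real WordNet).
SYNONYM_SETS = [{"big", "large"}, {"house", "home"}]

# Hypothetical stand-in for word2vec embeddings (hand-picked toy vectors).
VECTORS = {
    "the": (0.7, 0.7, 0.1), "a": (0.6, 0.8, 0.1),
    "big": (1.0, 0.0, 0.2), "large": (0.9, 0.1, 0.2),
    "house": (0.0, 1.0, 0.3), "home": (0.1, 0.9, 0.3),
}


def same_meaning(a, b):
    """True if two words are identical or share a toy synonym set."""
    return a == b or any(a in s and b in s for s in SYNONYM_SETS)


def align(candidates):
    """Step 1: keep one representative per group of equivalent candidates."""
    kept = []
    for word in candidates:
        if not any(same_meaning(word, k) for k in kept):
            kept.append(word)
    return kept


def build_hypotheses(outputs):
    """Step 2: combine the surviving candidates of every position."""
    slots = [align(column) for column in zip(*outputs)]
    return list(product(*slots))


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def vector(word):
    """Look up a toy embedding; unknown words map to the zero vector."""
    return VECTORS.get(word, (0.0, 0.0, 0.0))


def score(hypothesis, outputs):
    """Step 3: mean best-match similarity of the hypothesis to each output."""
    total = 0.0
    for output in outputs:
        for word in hypothesis:
            total += max(cosine(vector(word), vector(o)) for o in output)
    return total / (len(outputs) * len(hypothesis))


def combine(outputs):
    """Return the highest-scoring hypothesis as the combined translation."""
    return max(build_hypotheses(outputs), key=lambda h: score(h, outputs))


system_outputs = [
    ["the", "big", "house"],
    ["the", "large", "home"],
    ["a", "big", "home"],
]
print(" ".join(combine(system_outputs)))
```

In this toy setting, synonym-aware alignment collapses "big"/"large" and "house"/"home" into single candidates, so only the first position produces competing hypotheses, and the embedding-based score then favours the variant that agrees with more of the system outputs.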
topic System combination method
Machine translation
Statistical approach
Neural machine translation (NMT)
Neural network
Hierarchical machine translation (Hiero) systems
url http://www.sciencedirect.com/science/article/pii/S240584401936164X
_version_ 1724940246029697024