A Hybrid System for Glossary Generation of Feature Film Content for Language Learning
This report introduces a suite of command-line tools created to assist content developers with the creation of rich supplementary material to use in conjunction with feature films and other video assets in language teaching. The tools are intended to leverage open-source corpora and software (the OP...
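Main Author: | Corradini, Ryan Arthur |
---|---|
Format: | Others |
Published: | BYU ScholarsArchive 2010 |
Subjects: | electronic film review language instruction statistical machine translation terminology management Linguistics |
Online Access: | https://scholarsarchive.byu.edu/etd/2238 https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=3237&context=etd |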
id |
ndltd-BGMYU2-oai-scholarsarchive.byu.edu-etd-3237 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-BGMYU2-oai-scholarsarchive.byu.edu-etd-3237 2019-05-16T03:25:32Z A Hybrid System for Glossary Generation of Feature Film Content for Language Learning Corradini, Ryan Arthur This report introduces a suite of command-line tools created to assist content developers with the creation of rich supplementary material to use in conjunction with feature films and other video assets in language teaching. The tools are intended to leverage open-source corpora and software (the OPUS OpenSubs corpus and the Moses statistical machine translation system, respectively), but are written in a modular fashion so that other resources could be leveraged in their place. The completed tool suite facilitates three main tasks, which together constitute this project. First, several scripts created for use in preparing linguistic data for the system are discussed. Next, a set of scripts is described that together leverage the strengths of both terminology management and statistical machine translation to provide candidate translation entries for terms of interest. Finally, a tool chain and methodology are given for enriching the terminological data store based on the output of the machine translation process, thereby enabling greater accuracy and efficiency with each subsequent application. 2010-08-04T07:00:00Z text application/pdf https://scholarsarchive.byu.edu/etd/2238 https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=3237&context=etd http://lib.byu.edu/about/copyright/ All Theses and Dissertations BYU ScholarsArchive electronic film review language instruction statistical machine translation terminology management Linguistics |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
electronic film review language instruction statistical machine translation terminology management Linguistics |
spellingShingle |
electronic film review language instruction statistical machine translation terminology management Linguistics Corradini, Ryan Arthur A Hybrid System for Glossary Generation of Feature Film Content for Language Learning |
description |
This report introduces a suite of command-line tools created to assist content developers with the creation of rich supplementary material to use in conjunction with feature films and other video assets in language teaching. The tools are intended to leverage open-source corpora and software (the OPUS OpenSubs corpus and the Moses statistical machine translation system, respectively), but are written in a modular fashion so that other resources could be leveraged in their place. The completed tool suite facilitates three main tasks, which together constitute this project. First, several scripts created for use in preparing linguistic data for the system are discussed. Next, a set of scripts is described that together leverage the strengths of both terminology management and statistical machine translation to provide candidate translation entries for terms of interest. Finally, a tool chain and methodology are given for enriching the terminological data store based on the output of the machine translation process, thereby enabling greater accuracy and efficiency with each subsequent application. |
author |
Corradini, Ryan Arthur |
author_facet |
Corradini, Ryan Arthur |
author_sort |
Corradini, Ryan Arthur |
title |
A Hybrid System for Glossary Generation of Feature Film Content for Language Learning |
title_short |
A Hybrid System for Glossary Generation of Feature Film Content for Language Learning |
title_full |
A Hybrid System for Glossary Generation of Feature Film Content for Language Learning |
title_fullStr |
A Hybrid System for Glossary Generation of Feature Film Content for Language Learning |
title_full_unstemmed |
A Hybrid System for Glossary Generation of Feature Film Content for Language Learning |
title_sort |
hybrid system for glossary generation of feature film content for language learning |
publisher |
BYU ScholarsArchive |
publishDate |
2010 |
url |
https://scholarsarchive.byu.edu/etd/2238 https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=3237&context=etd |
work_keys_str_mv |
AT corradiniryanarthur ahybridsystemforglossarygenerationoffeaturefilmcontentforlanguagelearning AT corradiniryanarthur hybridsystemforglossarygenerationoffeaturefilmcontentforlanguagelearning |
_version_ |
1719186053057740800 |