A Hybrid System for Glossary Generation of Feature Film Content for Language Learning

This report introduces a suite of command-line tools created to assist content developers with the creation of rich supplementary material to use in conjunction with feature films and other video assets in language teaching. The tools are intended to leverage open-source corpora and software (the OP...


Bibliographic Details
Main Author: Corradini, Ryan Arthur
Format: Others
Published: BYU ScholarsArchive 2010
Online Access: https://scholarsarchive.byu.edu/etd/2238
https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=3237&context=etd
id ndltd-BGMYU2-oai-scholarsarchive.byu.edu-etd-3237
record_format oai_dc
spelling ndltd-BGMYU2-oai-scholarsarchive.byu.edu-etd-3237 2019-05-16T03:25:32Z A Hybrid System for Glossary Generation of Feature Film Content for Language Learning Corradini, Ryan Arthur This report introduces a suite of command-line tools created to assist content developers with the creation of rich supplementary material to use in conjunction with feature films and other video assets in language teaching. The tools are intended to leverage open-source corpora and software (the OPUS OpenSubs corpus and the Moses statistical machine translation system, respectively), but are written in a modular fashion so that other resources could be leveraged in their place. The completed tool suite facilitates three main tasks, which together constitute this project. First, several scripts created for use in preparing linguistic data for the system are discussed. Next, a set of scripts is described that together leverage the strengths of both terminology management and statistical machine translation to provide candidate translation entries for terms of interest. Finally, a tool chain and methodology are given for enriching the terminological data store based on the output of the machine translation process, thereby enabling greater accuracy and efficiency with each subsequent application. 2010-08-04T07:00:00Z text application/pdf https://scholarsarchive.byu.edu/etd/2238 https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=3237&context=etd http://lib.byu.edu/about/copyright/ All Theses and Dissertations BYU ScholarsArchive electronic film review language instruction statistical machine translation terminology management Linguistics
collection NDLTD
format Others
sources NDLTD
topic electronic film review
language instruction
statistical machine translation
terminology management
Linguistics
description This report introduces a suite of command-line tools created to assist content developers with the creation of rich supplementary material to use in conjunction with feature films and other video assets in language teaching. The tools are intended to leverage open-source corpora and software (the OPUS OpenSubs corpus and the Moses statistical machine translation system, respectively), but are written in a modular fashion so that other resources could be leveraged in their place. The completed tool suite facilitates three main tasks, which together constitute this project. First, several scripts created for use in preparing linguistic data for the system are discussed. Next, a set of scripts is described that together leverage the strengths of both terminology management and statistical machine translation to provide candidate translation entries for terms of interest. Finally, a tool chain and methodology are given for enriching the terminological data store based on the output of the machine translation process, thereby enabling greater accuracy and efficiency with each subsequent application.
author Corradini, Ryan Arthur
title A Hybrid System for Glossary Generation of Feature Film Content for Language Learning
publisher BYU ScholarsArchive
publishDate 2010
url https://scholarsarchive.byu.edu/etd/2238
https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=3237&context=etd