Leveraging Proximity Cues and Concept Information for Language Model Adaptation in Speech Recognition

碩士 === 國立臺灣師範大學 === 資訊工程學系 === 102 === This thesis investigates and develops language model adaptation techniques for Mandarin large vocabulary continuous speech recognition (LVCSR) and its main contribution is two-fold. First, the so-called “bag-of-words” assumption of conventional topic models is...

Full description

Bibliographic Details
Main Author: 郝柏翰
Other Authors: 陳柏琳
Format: Others
Language:zh-TW
Published: 2014
Online Access:http://ndltd.ncl.edu.tw/handle/60754768721537830713
Description
Summary:碩士 === 國立臺灣師範大學 === 資訊工程學系 === 102 === This thesis investigates and develops language model adaptation techniques for Mandarin large vocabulary continuous speech recognition (LVCSR) and its main contribution is two-fold. First, the so-called “bag-of-words” assumption of conventional topic models is relaxed by additionally incorporating word proximity cues into the model formulation. By doing so, the resulting topic models can achieve better prediction capabilities for use in LVCSR. Second, we propose a novel concept language modeling (CLM) approach to rendering the relationships between a search history and an upcoming word. The instantiations of CLM can be constructed with different levels of lexical granularities, such as words and document clusters. A series of experiments on a LVCSR task demonstrate that our proposed language models can offer substantial improvements over the baseline N-gram system, and achieve performance competitive to, or better than, some state-of-the-art language models.