Wake-up Word Detection Using Long Short Term Memory Network and Connectionist Temporal Classification

Master's === National Central University === Department of Computer Science and Information Engineering === 107 === With the development of deep learning, applications of artificial intelligence have become increasingly popular, and the performance of speech recognition has improved considerably. Wake-up word detection, also called keyword spotting, deals with the identificati...

Bibliographic Details
Main Authors: YU-SIN JHOU, 周郁馨
Other Authors: 王家慶
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/tu5xzc
id ndltd-TW-107NCU05392168
record_format oai_dc
spelling ndltd-TW-107NCU05392168 2019-10-22T05:28:16Z http://ndltd.ncl.edu.tw/handle/tu5xzc Wake-up Word Detection Using Long Short Term Memory Network and Connectionist Temporal Classification 基於長短期記憶網路和連結時序分類的喚醒詞辨識 YU-SIN JHOU 周郁馨 Master's National Central University Department of Computer Science and Information Engineering 107 王家慶 2019 學位論文 ; thesis 32 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Central University === Department of Computer Science and Information Engineering === 107 === With the development of deep learning, applications of artificial intelligence have become increasingly popular, and the performance of speech recognition has improved considerably. Wake-up word detection, also called keyword spotting, deals with identifying a keyword in an audio signal. Deep learning currently outperforms traditional approaches such as the hidden Markov model (HMM). To obtain a deep learning wake-up word model (for example, a deep neural network or a recurrent neural network), a large amount of audio of the specific word must be used for training, so that the model learns the features of the wake-up word audio and can predict whether the wake-up word occurs in a continuous audio signal. However, such keyword detection systems can only detect a fixed keyword. To change the keyword or add a new keyword to the system, new keyword-specific data must be collected and the model re-trained. In this thesis, we use a long short-term memory (LSTM) network and connectionist temporal classification (CTC) as the keyword detection model. It differs from general keyword detection in that the system uses the LSTM to predict phoneme posteriors and CTC to produce the probability of the phoneme sequence. Because the model predicts phoneme sequences, non-keyword data can be used as training data, which lets the model predict sequences more accurately. Moreover, when the wake-up word is changed, the system does not have to be re-trained; only some new wake-up word data is needed to modify the system.
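The abstract describes CTC turning per-frame phoneme posteriors into the probability of a phoneme sequence. The following is a minimal sketch of the standard CTC forward recursion in plain Python, not the thesis's implementation. The function name `ctc_sequence_probability`, the choice of index 0 for the blank symbol, and the integer phoneme indices are all assumptions made for illustration; in a real system the posteriors would come from the LSTM output layer.

```python
def ctc_sequence_probability(posteriors, target, blank=0):
    """Sum the probability of every CTC alignment that collapses to `target`.

    posteriors: list of frames; each frame is a list of per-symbol
                probabilities (hypothetical LSTM outputs, one row per frame).
    target:     list of phoneme indices without blanks (hypothetical IDs).
    blank:      index of the CTC blank symbol (assumed to be 0 here).
    """
    # Extended label sequence: blank, p1, blank, p2, ..., blank
    ext = [blank]
    for phoneme in target:
        ext += [phoneme, blank]
    size = len(ext)

    # Initialisation: an alignment may start with a blank or the first phoneme.
    alpha = [0.0] * size
    alpha[0] = posteriors[0][ext[0]]
    if size > 1:
        alpha[1] = posteriors[0][ext[1]]

    # Recursion over the remaining frames.
    for frame in posteriors[1:]:
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different phonemes.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * frame[ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last phoneme or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


if __name__ == "__main__":
    # Two frames, two symbols: index 0 is blank, index 1 is one phoneme.
    frames = [[0.4, 0.6], [0.3, 0.7]]
    print(ctc_sequence_probability(frames, [1]))
```

With the two toy frames above, the alignments that collapse to the single phoneme are (1, 1), (1, blank) and (blank, 1), so the result is about 0.42 + 0.18 + 0.28 = 0.88. Under this sketch, a detector could compare the score for the wake-up word's phoneme sequence against a tuned threshold. Changing the wake-up word then only means changing `target`, which is consistent with the abstract's point that the acoustic model need not be re-trained.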
author2 王家慶
author_facet 王家慶
YU-SIN JHOU
周郁馨
author YU-SIN JHOU
周郁馨
author_sort YU-SIN JHOU
title Wake-up Word Detection Using Long Short Term Memory Network and Connectionist Temporal Classification
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/tu5xzc