Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription
Master's thesis === National Taipei University of Technology (國立臺北科技大學) === Graduate Institute of Electronic Engineering (電子工程系研究所) === 105 === It is difficult to collect enough labeled data to train a high-performance automatic speech recognition (ASR) system. However, a virtually unlimited amount of unlabeled speech data is easy to obtain. To exploit this advantage, the objective of this thesis is to use semi-sup...
Main Authors: | Sing-Yue Wang 王星月 |
---|---|
Other Authors: | 廖元甫 |
Format: | Others |
Language: | zh-TW |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/v665zj |
id |
ndltd-TW-105TIT05427106 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105TIT05427106 2019-05-15T23:53:44Z http://ndltd.ncl.edu.tw/handle/v665zj Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription 基於半監督式學習之廣播節目語音逐字稿自動轉寫系統 Sing-Yue Wang 王星月 Master's thesis National Taipei University of Technology (國立臺北科技大學) Graduate Institute of Electronic Engineering (電子工程系研究所) 105 It is difficult to collect enough labeled data to train a high-performance automatic speech recognition (ASR) system. However, a virtually unlimited amount of unlabeled speech data is easy to obtain. To exploit this advantage, this thesis uses semi-supervised training to improve ASR performance. We use quality estimation (QE) to predict the word error rate (WER) of each utterance. The subset of unlabeled utterances predicted to have good recognition quality is then added to the training data of the speech recognizer, and the acoustic model is retrained. In the experiments, we evaluate on two test sets of broadcast material. With QE-based data selection, the character error rate (CER) is reduced from 25.00% to 23.61% and from 14.24% to 13.24%. Retraining the language model with Giga Word further reduces the CER from 23.61% to 23.25% and from 13.24% to 12.63%. Finally, we implement an online Automatic Radio Transcriber that provides a speech recognition service. 廖元甫 2017 學位論文 ; thesis 76 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Taipei University of Technology (國立臺北科技大學) === Graduate Institute of Electronic Engineering (電子工程系研究所) === 105 === It is difficult to collect enough labeled data to train a high-performance automatic speech recognition (ASR) system. However, a virtually unlimited amount of unlabeled speech data is easy to obtain. To exploit this advantage, this thesis uses semi-supervised training to improve ASR performance. We use quality estimation (QE) to predict the word error rate (WER) of each utterance. The subset of unlabeled utterances predicted to have good recognition quality is then added to the training data of the speech recognizer, and the acoustic model is retrained. In the experiments, we evaluate on two test sets of broadcast material. With QE-based data selection, the character error rate (CER) is reduced from 25.00% to 23.61% and from 14.24% to 13.24%. Retraining the language model with Giga Word further reduces the CER from 23.61% to 23.25% and from 13.24% to 12.63%. Finally, we implement an online Automatic Radio Transcriber that provides a speech recognition service. |
author2 |
廖元甫 |
author_facet |
廖元甫 Sing-Yue Wang 王星月 |
author |
Sing-Yue Wang 王星月 |
spellingShingle |
Sing-Yue Wang 王星月 Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription |
author_sort |
Sing-Yue Wang |
title |
Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/v665zj |
_version_ |
1719156701604610048 |
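
The data selection step described in the abstract (keep only the automatically transcribed utterances whose predicted WER is low, then add them to the training data) might look roughly like the sketch below. This is an illustration under stated assumptions, not the thesis's implementation: the record does not give the QE model, its features, or the selection threshold, so the `Hypothesis` class, the function names, the sample utterances, and the 0.10 cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One automatically transcribed utterance (all fields are illustrative)."""
    utt_id: str
    text: str             # ASR output, used as a pseudo-label
    predicted_wer: float  # QE model's estimate; 0.0 means predicted error-free


def select_for_retraining(hyps, max_predicted_wer=0.10):
    """Keep utterances whose predicted WER is at or below the threshold.

    The 0.10 default is an arbitrary placeholder, not a value from the thesis.
    """
    return [h for h in hyps if h.predicted_wer <= max_predicted_wer]


def build_training_set(labeled, hyps, max_predicted_wer=0.10):
    """Merge human-labeled (utt_id, text) pairs with the selected pseudo-labels."""
    selected = select_for_retraining(hyps, max_predicted_wer)
    return list(labeled) + [(h.utt_id, h.text) for h in selected]


# Made-up example data: one labeled utterance and three unlabeled ones
# that have been decoded and scored by a (hypothetical) QE model.
labeled = [("lab-001", "今天天氣很好")]
hyps = [
    Hypothesis("radio-001", "歡迎收聽本節目", 0.04),
    Hypothesis("radio-002", "接下來是新聞", 0.35),
    Hypothesis("radio-003", "謝謝您的收聽", 0.10),
]
train = build_training_set(labeled, hyps)
```

The resulting `train` list would then be passed to whatever acoustic model training recipe is in use. The threshold trades off how much extra data is added against how noisy the pseudo-labels are, and would normally be tuned on held-out data.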