Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription

Master's thesis === National Taipei University of Technology (國立臺北科技大學) === Graduate Institute of Electronic Engineering (電子工程系研究所) === 105 === It is difficult to collect enough labeled data to train a high-performance automatic speech recognition (ASR) system, whereas unlabeled speech data is easy to obtain in large quantities. To exploit this, the objective of this paper is to use semi-sup...

Bibliographic Details
Main Authors:	Sing-Yue Wang, 王星月
Other Authors:	廖元甫
Format:	Others
Language:	zh-TW
Published:	2017
Online Access:	http://ndltd.ncl.edu.tw/handle/v665zj

id	ndltd-TW-105TIT05427106
record_format	oai_dc
spelling	ndltd-TW-105TIT05427106 2019-05-15T23:53:44Z http://ndltd.ncl.edu.tw/handle/v665zj Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription 基於半監督式學習之廣播節目語音逐字稿自動轉寫系統 Sing-Yue Wang 王星月 Master's thesis, National Taipei University of Technology (國立臺北科技大學), Graduate Institute of Electronic Engineering (電子工程系研究所), 105. 廖元甫 2017 degree thesis ; thesis 76 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's thesis === National Taipei University of Technology (國立臺北科技大學) === Graduate Institute of Electronic Engineering (電子工程系研究所) === 105 === It is difficult to collect enough labeled data to train a high-performance automatic speech recognition (ASR) system, whereas unlabeled speech data is easy to obtain in large quantities. To exploit this, the objective of this paper is to use semi-supervised training to improve ASR performance. We use quality estimation (QE) to predict the WER of each utterance. The subset of unlabeled utterances predicted to have good recognition quality is then added to the recognizer's training data, and the acoustic model is retrained. In the experiments, we evaluate on two test sets of broadcast material. With QE-based data selection, the CER is reduced from 25.00% to 23.61% and from 14.24% to 13.24%. Retraining the language model with Giga Word further reduces the CER from 23.61% to 23.25% and from 13.24% to 12.63%. Finally, we implement an online Automatic Radio Transcriber that provides a speech recognition service.
author2	廖元甫
author_facet	廖元甫 Sing-Yue Wang 王星月
author	Sing-Yue Wang 王星月
author_sort	Sing-Yue Wang
title	Speech Recognition Quality Estimation-based Semi-Supervised Training for Broadcast Radio Program Transcription
publishDate	2017
url	http://ndltd.ncl.edu.tw/handle/v665zj
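
The data selection step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Hypothesis` class, the `select_for_retraining` function, the 0.10 threshold, and the sample utterances are all hypothetical and chosen only for illustration. The `relative_reduction` helper shows how the CER figures reported in the abstract translate into relative improvements.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One automatically transcribed, originally unlabeled utterance."""
    utt_id: str
    text: str             # ASR output used as a pseudo-label
    predicted_wer: float  # error rate predicted by a QE model (hypothetical)


def select_for_retraining(hyps, max_predicted_wer=0.10):
    """Keep utterances whose predicted WER is at or below the threshold.

    The threshold value is an assumption for this sketch; the abstract
    does not state which cutoff was used.
    """
    return [h for h in hyps if h.predicted_wer <= max_predicted_wer]


def relative_reduction(before, after):
    """Relative error-rate reduction in percent."""
    return (before - after) / before * 100.0


# Illustrative, made-up utterances (not data from the thesis).
pool = [
    Hypothesis("utt1", "transcript one", 0.05),
    Hypothesis("utt2", "transcript two", 0.32),
    Hypothesis("utt3", "transcript three", 0.10),
]
selected = select_for_retraining(pool)
print([h.utt_id for h in selected])

# CER pairs taken from the abstract (QE-based selection, two test sets).
for before, after in [(25.00, 23.61), (14.24, 13.24)]:
    print(f"{before:.2f}% -> {after:.2f}%: "
          f"{relative_reduction(before, after):.2f}% relative")
```

The selected utterances, paired with their ASR transcripts, would then be merged with the labeled data before the acoustic model is retrained. A stricter threshold admits fewer but cleaner pseudo-labels, while a looser one adds more data at the cost of more transcription errors.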
