On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System

Master's thesis === National Cheng Kung University === Department of Electrical Engineering (Master's and Doctoral Program) === 96 === Technology always comes from human nature. Speech recognition applications are increasingly popular in daily life, yet they still leave great room for improvement. Improving speech recognition technology and making it more convenient for people is our continuing effort...

Bibliographic Details
Main Authors:	Yuan-Ning Lin, 林苑寧
Other Authors:	Jhing-Fa Wang
Format:	Others
Language:	en_US
Published:	2008
Online Access:	http://ndltd.ncl.edu.tw/handle/61962122110908352771

id	ndltd-TW-096NCKU5442197
record_format	oai_dc
spelling	ndltd-TW-096NCKU5442197 2017-07-20T04:35:15Z http://ndltd.ncl.edu.tw/handle/61962122110908352771 On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System 應用SVM與MLLR於多人線上語者調適之泛在語音辨識系統 Yuan-Ning Lin 林苑寧 Jhing-Fa Wang 王駿發 2008 thesis 70 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's thesis === National Cheng Kung University === Department of Electrical Engineering (Master's and Doctoral Program) === 96 === Technology always comes from human nature. Speech recognition applications are increasingly popular in daily life, yet they still leave great room for improvement. Improving speech recognition technology and making it more convenient for people is our continuing goal. Currently, most everyday speech recognition applications, such as voice-activated toys and mobile phones with voice dialing, need either a speaker-independent model or a speaker-dependent model that the user must train first. We think that speech recognition with a speaker-independent model is not convenient enough for a home environment with a fixed set of family members. Therefore, this thesis proposes on-line multi-speaker adaptation based on SVM and MLLR for a ubiquitous speech recognition system. The system not only recognizes speech with the appropriate model for each family member to improve accuracy, but also adapts the acoustic model on-line toward each speaker's characteristics as they use the system. The presented architecture can improve the adaptation modeling accuracy of the conventional maximum likelihood linear regression (MLLR) technique. The proposed system contains three phases: the training phase, the recognition phase, and the adaptation phase. First, in the training phase, we generate MLLR regression matrix sets of the Gaussian mixture model (GMM) parameters to construct the eigenspace, and build an SVM speaker classification model from a small amount of training data. Next, in the recognition phase, we combine the speaker-independent model with the MLLR transformation matrix set selected from the MLLR transformation matrix set database according to the speaker class output by the SVM classifier. Then we recognize the test speech with this adapted model.
In the adaptation phase, we replace the present MLLR transformation matrix set with an adapted set obtained as a weighted merge of three transformation matrix sets: 1) the present MLLR transformation matrix set; 2) the MLLR regression transformation matrix set estimated by maximum likelihood from the eigenspace and the recognition result; and 3) the MLLR regression transformation matrix set adapted from speech recognition results that have been screened by a confidence measure, to reduce erroneous training caused by noise or incorrect recognition. The experimental results show that the proposed method improves speech recognition accuracy by about 3% to 8% on average with speaker adaptation.
author2	Jhing-Fa Wang
author_facet	Jhing-Fa Wang Yuan-Ning Lin 林苑寧
author	Yuan-Ning Lin 林苑寧
author_sort	Yuan-Ning Lin
title	On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System
publishDate	2008
url	http://ndltd.ncl.edu.tw/handle/61962122110908352771
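The adaptation phase described in the abstract merges three MLLR transformation matrix sets by weighting. The abstract does not state the weighting scheme, so the following is only a minimal sketch under stated assumptions: each transform is a single d x (d+1) matrix W = [b | A] (the usual MLLR mean-transform layout), the merge is an element-wise weighted average, and the function names, example matrices, and weights are all hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def merge_transforms(transforms: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Element-wise weighted average of equally shaped transforms.

    Assumption: the thesis only says the sets are "merged by weighting";
    a normalized convex combination is one plausible reading.
    """
    if len(transforms) != len(weights):
        raise ValueError("need one weight per transform")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]
    rows, cols = len(transforms[0]), len(transforms[0][0])
    return [
        [sum(w * t[r][c] for w, t in zip(norm, transforms)) for c in range(cols)]
        for r in range(rows)
    ]


def adapt_mean(transform: Matrix, mean: Sequence[float]) -> List[float]:
    """Apply W = [b | A] to the extended mean [1, mu], i.e. A @ mu + b."""
    extended = [1.0, *mean]
    return [sum(w * x for w, x in zip(row, extended)) for row in transform]


# Hypothetical 2-dimensional example: all three transforms keep A = I
# and differ only in the bias column b (first column).
present = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]        # 1) current transform
eigenspace_est = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # 2) eigenspace ML estimate
confident_est = [[2.0, 1.0, 0.0], [4.0, 0.0, 1.0]]   # 3) confidence-screened estimate

merged = merge_transforms(
    [present, eigenspace_est, confident_est],
    weights=[0.5, 0.25, 0.25],  # hypothetical weights
)
adapted = adapt_mean(merged, [1.0, 2.0])  # adapted Gaussian mean
```

In this sketch, a larger weight on the present transform makes on-line updates more conservative, while a larger weight on the confidence-screened estimate tracks the current speaker faster. A real system would hold one such matrix per regression class rather than a single global matrix.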

Similar Items