On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System

Master's === National Cheng Kung University === Department of Electrical Engineering (Master's and Doctoral Program) === 96 === Technology always comes from human nature. Speech recognition applications are increasingly popular in daily life but still have considerable room for improvement. Improving speech recognition technology and making it more convenient for people is our ongoing...


Bibliographic Details
Main Authors: Yuan-Ning Lin, 林苑寧
Other Authors: Jhing-Fa Wang
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/61962122110908352771
id ndltd-TW-096NCKU5442197
record_format oai_dc
spelling ndltd-TW-096NCKU5442197 2017-07-20T04:35:15Z http://ndltd.ncl.edu.tw/handle/61962122110908352771 On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System 應用SVM與MLLR於多人線上語者調適之泛在語音辨識系統 Yuan-Ning Lin 林苑寧 碩士 國立成功大學 電機工程學系碩博士班 96 Jhing-Fa Wang 王駿發 2008 學位論文 ; thesis 70 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Cheng Kung University === Department of Electrical Engineering (Master's and Doctoral Program) === 96 === Technology always comes from human nature. Speech recognition applications are increasingly popular in daily life but still have considerable room for improvement. Improving speech recognition technology and making it more convenient for people is our ongoing goal. Currently, most everyday speech recognition applications, such as voice-activated toys and mobile phones with voice dialing, require either a speaker-independent model or a speaker-dependent model that the user must train first. We believe that speech recognition with a speaker-independent model is not convenient enough for a home environment with a fixed set of family members. Therefore, this thesis proposes on-line multi-speaker adaptation based on SVM and MLLR for a ubiquitous speech recognition system. The system not only recognizes speech with the appropriate model for each family member to improve accuracy, but also adapts the acoustic model on-line toward each speaker's characteristics as they use the system. The proposed architecture can improve the adaptation modeling accuracy of the conventional maximum likelihood linear regression (MLLR) technique. The proposed system has three phases: the training phase, the recognition phase, and the adaptation phase. First, in the training phase, we generate MLLR regression matrix sets of the Gaussian mixture model (GMM) parameters to construct the eigenspace, and build an SVM speaker classification model from a small amount of training data. Next, in the recognition phase, we combine the speaker-independent model with the MLLR transformation matrix set obtained from the MLLR transformation matrix set database according to the speaker class output by the SVM classifier. We then recognize the test speech with this adapted model.
In the adaptation phase, we replace the present MLLR transformation matrix set with an adapted set formed as a weighted combination of three transformation matrix sets: 1) the present MLLR transformation matrix set; 2) the MLLR regression transformation matrix set estimated by maximum likelihood from the eigenspace and the recognition result; and 3) the MLLR regression transformation matrix set adapted using the speech recognition results that pass a confidence measure, which reduces erroneous training caused by noise or recognition errors. The experimental results show that the proposed method improves speech recognition accuracy by about 3% to 8% on average with speaker adaptation.
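The weighted merge in the adaptation phase can be illustrated with a small sketch. This is not the thesis implementation: the record does not give the weights, the matrix dimensions, or how the three sets are estimated, so the function names `merge_transforms` and `apply_mllr`, the toy matrices, and the weights `[0.5, 0.25, 0.25]` are all assumptions made for illustration. The sketch treats one transformation matrix per source (a real matrix set would hold one matrix per regression class) and uses the standard MLLR mean update, in which the adapted mean is W applied to the extended vector [1, mean].

```python
from typing import List, Sequence

Matrix = List[List[float]]


def merge_transforms(transforms: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Element-wise weighted sum of equally shaped matrices (hypothetical helper)."""
    if len(transforms) != len(weights):
        raise ValueError("need one weight per transform")
    if abs(sum(weights) - 1.0) > 1e-9:
        # Assumption: the weights form a convex combination.
        raise ValueError("weights are assumed to sum to 1")
    rows, cols = len(transforms[0]), len(transforms[0][0])
    merged = [[0.0] * cols for _ in range(rows)]
    for w, t in zip(weights, transforms):
        for i in range(rows):
            for j in range(cols):
                merged[i][j] += w * t[i][j]
    return merged


def apply_mllr(W: Matrix, mean: Sequence[float]) -> List[float]:
    """MLLR mean update: adapted mean = W @ [1, mean]; column 0 of W is the bias."""
    extended = [1.0, *mean]
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, extended)) for row in W]


# Toy 2-dimensional transforms standing in for the three sources in the abstract.
present = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]      # 1) present transform (identity, no bias)
eigenspace = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]   # 2) eigenspace estimate (bias of 1)
confident = [[0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]    # 3) confidence-filtered estimate (scale by 2)

merged = merge_transforms([present, eigenspace, confident], [0.5, 0.25, 0.25])
adapted = apply_mllr(merged, [2.0, 4.0])
print(merged)
print(adapted)
```

Keeping a large weight on the present transform, as in this toy setting, would make each on-line update conservative; whether the thesis weights the sets this way is not stated in the record.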
author2 Jhing-Fa Wang
author_facet Jhing-Fa Wang
Yuan-Ning Lin
林苑寧
author Yuan-Ning Lin
林苑寧
title On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/61962122110908352771