On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System
Master's === National Cheng Kung University === Department of Electrical Engineering, Master's and Doctoral Program === 96 === Technology always comes from human nature. Speech recognition applications are growing in popularity in daily life, but they still have great room for improvement. Improving speech recognition technology and making it more convenient for people is our continuing effort...
Main Authors: | Yuan-Ning Lin, 林苑寧 |
---|---|
Other Authors: | Jhing-Fa Wang |
Format: | Others |
Language: | en_US |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/61962122110908352771 |
id |
ndltd-TW-096NCKU5442197 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096NCKU5442197 2017-07-20T04:35:15Z http://ndltd.ncl.edu.tw/handle/61962122110908352771 On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System 應用SVM與MLLR於多人線上語者調適之泛在語音辨識系統 Yuan-Ning Lin 林苑寧 碩士 國立成功大學 電機工程學系碩博士班 96 Jhing-Fa Wang 王駿發 2008 學位論文 ; thesis 70 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Cheng Kung University === Department of Electrical Engineering, Master's and Doctoral Program === 96 === Technology always comes from human nature. Speech recognition applications are growing in popularity in daily life, but they still have great room for improvement. Improving speech recognition technology and making it more convenient for people is the goal of our continuing effort.
Currently, most everyday speech recognition applications, such as voice-activated toys and mobile phones with voice dialing, require either a speaker-independent model or a speaker-dependent model that the user must train first. We believe that speech recognition with a speaker-independent model is not convenient enough for a home environment with a fixed set of family members. Therefore, this thesis proposes on-line multi-speaker adaptation based on SVM and MLLR for a ubiquitous speech recognition system. The system not only recognizes speech with the appropriate model for each family member to improve accuracy, but also adapts the acoustic model on-line toward each speaker's characteristics as they use the system.
The presented architecture can improve the adaptation modeling accuracy of the conventional maximum likelihood linear regression (MLLR) technique. The proposed system contains three phases: the training phase, the recognition phase, and the adaptation phase. First, in the training phase, we generate MLLR regression matrix sets for the Gaussian mixture model (GMM) parameters to construct the eigenspace, and build an SVM speaker classification model from a small amount of training data. Next, in the recognition phase, we combine the speaker-independent model with the MLLR transformation matrix set selected from the MLLR transformation matrix set database according to the speaker class output by the SVM classifier. We then recognize the test speech with this adapted model. In the adaptation phase, we replace the present MLLR transformation matrix set with an adapted set formed as a weighted combination of three transformation matrix sets: 1) the present MLLR transformation matrix set; 2) the MLLR regression transformation matrix set estimated by maximum likelihood from the eigenspace and the recognition result; and 3) the MLLR regression transformation matrix set adapted using speech recognition results that have been screened by a confidence measure, to reduce erroneous training caused by noise or recognition errors.
The experimental results show that, with speaker adaptation, the proposed method improves speech recognition accuracy by about 3% to 8% on average.
|
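The adaptation phase summarized in the abstract can be illustrated with a minimal sketch. It assumes the standard MLLR mean transform, mu' = W [1, mu] with W = [b A], and an element-wise weighted average of the three transformation matrices. The abstract does not give the merge weights or say exactly how the confidence measure is applied, so the function names, the default weights, and the threshold-based gating below are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def apply_mllr(W: Matrix, mean: Sequence[float]) -> List[float]:
    """Adapt one Gaussian mean with an MLLR transform W = [b A]: mu' = W @ [1, mu]."""
    extended = [1.0] + list(mean)  # extended mean vector [1, mu_1, ..., mu_d]
    return [sum(w * x for w, x in zip(row, extended)) for row in W]


def merge_transforms(transforms: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Element-wise weighted average of same-shaped matrices.

    Weights are renormalized to sum to 1 (an assumption of this sketch).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    norm = [w / total for w in weights]
    rows, cols = len(transforms[0]), len(transforms[0][0])
    return [
        [sum(a * T[i][j] for a, T in zip(norm, transforms)) for j in range(cols)]
        for i in range(rows)
    ]


def adapt_transform(
    present: Matrix,
    eigen_estimate: Matrix,
    confident_estimate: Matrix,
    confidence: float,
    threshold: float = 0.5,            # hypothetical confidence cut-off
    weights: Sequence[float] = (0.5, 0.3, 0.2),  # hypothetical merge weights
) -> Matrix:
    """Merge the three transform sets named in the abstract.

    Assumption: when the recognition confidence is below the threshold,
    the third (recognition-driven) estimate is left out of the merge.
    """
    w = list(weights)
    if confidence < threshold:
        w[2] = 0.0
    return merge_transforms([present, eigen_estimate, confident_estimate], w)


if __name__ == "__main__":
    identity = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # b = 0, A = I
    shifted = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]    # b = 1, A = I
    noisy = [[10.0, 1.0, 0.0], [10.0, 0.0, 1.0]]    # b = 10, A = I
    merged = adapt_transform(identity, shifted, noisy, confidence=0.1)
    print(apply_mllr(merged, [1.0, 2.0]))
```

A real system would hold one such matrix per regression class and per speaker, and would estimate the second and third matrices from adaptation statistics rather than passing them in directly.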
author2 |
Jhing-Fa Wang |
author_facet |
Jhing-Fa Wang Yuan-Ning Lin 林苑寧 |
author |
Yuan-Ning Lin 林苑寧 |
author_sort |
Yuan-Ning Lin |
title |
On-Line Multi-Speaker Adaptation Based on SVM and MLLR for Ubiquitous Speech Recognition System |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/61962122110908352771 |