KINECT Microphone Array-Based Speaker Localization for Speech Pattern Recognition
Master's thesis === National Formosa University === Graduate Institute of Electrical Engineering === 103 === This thesis develops a speech recognition system that incorporates speaker location, using the KINECT microphone array device developed by Microsoft. With the audio signal provided by the microphone array and the basis of the...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/5sanwr |
id |
ndltd-TW-103NYPI5441009 |
---|---|
record_format |
oai_dc |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Formosa University === Graduate Institute of Electrical Engineering === 103 === This thesis develops a speech recognition system that incorporates speaker location, using the KINECT microphone array device developed by Microsoft. Based on the audio signal provided by the microphone array and on the speaker location, the speech recognition system is developed to benefit from speaker positioning.
The thesis uses the KINECT microphone array to locate the speaker, and compares two localization methods. The first uses the Software Development Kit (SDK) provided by Microsoft; this method limits the range of detectable angles and cannot calculate distance. We therefore propose a second method based on Time Difference of Arrival (TDOA), which detects a wider range of angles than the KINECT SDK and can also calculate distance.
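A minimal sketch of the TDOA idea for a single microphone pair, assuming a far-field source: the delay between two channels is found by cross-correlation, and the arrival angle follows from theta = arcsin(c * tau / d). The function names, the speed of sound, and the spacing and sample-rate parameters are illustrative assumptions, not values from the thesis. Distance estimation, which the thesis also reports, would need more than one microphone pair and is not shown.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_delay_samples(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a.

    Uses a brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_angle_deg(delay_samples, sample_rate_hz, mic_spacing_m):
    """Convert a pairwise delay into a far-field arrival angle in degrees."""
    tau = delay_samples / sample_rate_hz            # delay in seconds
    ratio = SPEED_OF_SOUND * tau / mic_spacing_m    # equals sin(theta)
    ratio = max(-1.0, min(1.0, ratio))              # clamp noisy estimates
    return math.degrees(math.asin(ratio))
```

The clamp is there because a noisy delay estimate can exceed the physical maximum of d / c, which would otherwise make the arcsine undefined.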
The thesis applies KINECT microphone array-based speaker localization to speech pattern recognition in two tasks: the first is speaker verification and the second is voice recognition. To improve the recognition rate, the research adds a Type-1 Fuzzy Model and a Type-2 Fuzzy Model to further improve system performance.
For speaker verification with KINECT microphone array speaker localization, this research uses a Support Vector Machine (SVM) together with a decision fusion approach, and proposes three speaker verification methods. Method one introduces a Type-1 Fuzzy System into KINECT SDK microphone array speaker verification; the fuzzy system input is the angle calculated by the KINECT SDK. Method two introduces a Type-1 Fuzzy System into KINECT TDOA microphone array speaker verification; the fuzzy system inputs are the angle and distance calculated by KINECT TDOA. Method three introduces a Type-2 Fuzzy System into KINECT TDOA microphone array speaker verification; the fuzzy system input is the angle calculated by KINECT TDOA.
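The abstract does not give the membership functions or rules, so the following is only a hypothetical sketch of how a Type-1 fuzzy system could turn a calculated angle into a reliability weight for a verification score. All set boundaries, rule outputs, and function names are assumptions made for illustration, and the weighted-average (zero-order Sugeno style) inference is a choice made here, not one confirmed by the source.

```python
def triangular(x, left, peak, right):
    """Triangular membership function on (left, right) peaking at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over the absolute angle in degrees, each paired
# with a singleton rule output.
RULES = [
    ((-30.0, 0.0, 30.0), 1.0),    # speaker in front: trust the score fully
    ((15.0, 45.0, 75.0), 0.6),    # speaker to the side: moderate trust
    ((60.0, 90.0, 120.0), 0.2),   # speaker far to the side: low trust
]


def angle_weight(angle_deg):
    """Map a calculated angle to a weight via weighted-average inference."""
    a = min(abs(angle_deg), 90.0)  # the sets above cover 0..90 degrees
    num = den = 0.0
    for (left, peak, right), output in RULES:
        mu = triangular(a, left, peak, right)
        num += mu * output
        den += mu
    return num / den


def fused_score(svm_score, angle_deg):
    """Scale a verification score by the angle-dependent weight."""
    return angle_weight(angle_deg) * svm_score
```

A Type-2 system would replace each crisp membership value with an interval and add a type-reduction step before the weighted average; that extension is not shown here.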
For voice recognition with KINECT microphone array speaker localization, this research uses Dynamic Time Warping (DTW) together with a decision fusion approach, and proposes three voice recognition methods. Method one introduces a Type-1 Fuzzy System into KINECT SDK microphone array voice recognition; the fuzzy system input is the angle calculated by the KINECT SDK. Method two introduces a Type-1 Fuzzy System into KINECT TDOA microphone array voice recognition; the fuzzy system inputs are the angle and distance calculated by KINECT TDOA. Method three introduces a Type-2 Fuzzy System into KINECT TDOA microphone array voice recognition; the fuzzy system input is the angle calculated by KINECT TDOA.
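Dynamic Time Warping aligns two sequences of different lengths by allowing local stretching along the time axis. A minimal sketch with scalar features and an absolute-difference cost follows; practical recognizers usually compare feature vectors per frame, and the template dictionary and function names here are assumptions for illustration rather than the thesis implementation.

```python
def dtw_distance(seq_a, seq_b):
    """Classic dynamic-programming DTW distance between two sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the best alignment cost of seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b element repeated
                cost[i][j - 1],      # seq_a element repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def nearest_template(templates, query):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(templates[label], query))
```

Recognition by template matching then reduces to calling `nearest_template` with one stored sequence per word and the sequence extracted from the incoming utterance.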
The thesis further uses TDOA speaker localization to adjust the SVM parameters C and γ in SVM speaker verification: a fuzzy system takes the angle and distance calculated by KINECT TDOA as inputs and determines the parameters C and γ.
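The parameter γ suggests an RBF kernel, which is an assumption here since the abstract does not name the kernel. In that setting γ controls how quickly similarity decays with distance, K(x, y) = exp(-γ * ||x - y||^2), and C is the soft-margin penalty that bounds each dual coefficient (0 <= alpha_i <= C). The sketch below evaluates a decision function from hypothetical support vectors and coefficients; it does not train a model.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svm_decision(support_vectors, signed_alphas, bias, x, gamma):
    """Evaluate f(x) = sum_i (alpha_i * y_i) * K(sv_i, x) + bias.

    `signed_alphas` holds alpha_i * y_i for each support vector; in a
    trained soft-margin SVM each magnitude is at most C.
    """
    total = bias
    for sv, coef in zip(support_vectors, signed_alphas):
        total += coef * rbf_kernel(sv, x, gamma)
    return total
```

A larger γ makes each support vector influence only its close neighborhood, while a larger C penalizes training errors more heavily, so the two are normally tuned together.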
With the KINECT SDK fusion method, the SVM speaker verification rate rose from 52.5% (single microphone) to 62.5%, and the DTW speech recognition rate rose from 57.6% (single microphone) to 79.2%. For speaker verification with KINECT microphone array speaker localization, the three proposed methods give SVM speaker verification rates of 88.99% for method one (Type-1 fuzzy model with the KINECT SDK fusion method), 90.99% for method two (Type-1 fuzzy model with the KINECT TDOA fusion method), and 93.99% for method three (Type-2 fuzzy model with the KINECT TDOA fusion method). For voice recognition with KINECT microphone array speaker localization, the three proposed methods give rates of 84.62% for method one (Type-1 fuzzy model with the KINECT SDK fusion method), 91.4% for method two (Type-1 fuzzy model with the KINECT TDOA fusion method), and 92.52% for method three (Type-2 fuzzy model with the KINECT TDOA fusion method). Using TDOA speaker localization to adjust the SVM parameters C and γ in SVM speaker verification yields an average recognition rate of 83.16%, which is higher than that of the traditional list approach. The experimental results show that the proposed methods improve recognition and outperform the original method.
|
author2 |
丁英智 |
author_facet |
丁英智 Jia-Yi Shih 施家逸 |
author |
Jia-Yi Shih 施家逸 |
spellingShingle |
Jia-Yi Shih 施家逸 KINECT Microphone Array-Based Speaker Localization for Speech Pattern Recognition |
author_sort |
Jia-Yi Shih |
title |
KINECT Microphone Array-Based Speaker Localization for Speech Pattern Recognition |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/5sanwr |
work_keys_str_mv |
AT jiayishih kinectmicrophonearraybasedspeakerlocalizationforspeechpatternrecognition AT shījiāyì kinectmicrophonearraybasedspeakerlocalizationforspeechpatternrecognition AT jiayishih jiéhékinectmàikèfēngzhènlièzhīyǔzhědìngwèideyǔyīnmóyàngbiànshíyánjiū AT shījiāyì jiéhékinectmàikèfēngzhènlièzhīyǔzhědìngwèideyǔyīnmóyàngbiànshíyánjiū |
_version_ |
1719253485162070016 |
spelling |
ndltd-TW-103NYPI5441009 2019-09-21T03:32:36Z http://ndltd.ncl.edu.tw/handle/5sanwr KINECT Microphone Array-Based Speaker Localization for Speech Pattern Recognition 結合KINECT麥克風陣列之語者定位的語音模樣辨識研究 Jia-Yi Shih 施家逸 Master's thesis, National Formosa University, Graduate Institute of Electrical Engineering, 103 丁英智 2015 學位論文 ; thesis 80 zh-TW |