Three-dimensional Sound Source Localization
Master's thesis, National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, academic year 97 (ROC calendar). In this thesis, we study and implement a system that detects the direction of a sound source in three-dimensional space. On the hardware side, an equilateral-triangle microphone array of only three microphones captures the voice signals. ...
Main Authors: | Shan-hsiang Yang, 楊善翔 |
---|---|
Other Authors: | Hung-yan Gu |
Format: | Others |
Language: | zh-TW |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/00117977278704040317 |
id |
ndltd-TW-097NTUS5392012 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-097NTUS5392012; 2015-10-13T14:49:22Z; http://ndltd.ncl.edu.tw/handle/00117977278704040317; Three-dimensional Sound Source Localization (聲源三維方位偵測之研究); Shan-hsiang Yang (楊善翔); Master's thesis, National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, academic year 97 (ROC calendar); advisor Hung-yan Gu (古鴻炎); 2009; thesis; 78 pages; zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis, National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, academic year 97 (ROC calendar). In this thesis, we study and implement a system that detects the direction of a sound source in three-dimensional space. On the hardware side, an equilateral-triangle microphone array of only three microphones captures the voice signals. On the software side, voice activity detection (VAD), time difference of arrival (TDOA) estimation, and direction detection are executed in order. For VAD, we propose a method based on spectral entropy plus SNR verification to distinguish speech frames from non-speech frames. To estimate the TDOA, an approximation algorithm computes a generalized cross-correlation function. We propose a synchronous phase replication method to solve the problem of unstable phase, and a parabolic-interpolation-based method to increase the accuracy of the estimated TDOA values. The distances between the estimated vector of TDOA values and the vectors of theoretical values are then computed to find the direction of the sound source, and interpolation further improves the accuracy. Finally, we propose a weighted voting mechanism that determines the final direction angle from the angles obtained in several speech frames. In online experiments, the system ran in real time with a small amount of computation. The average azimuth error is 3.43 degrees and the average elevation error is 2.08 degrees, so the overall performance of the sound source localization system is good. |
author2 |
Hung-yan Gu |
author_facet |
Hung-yan Gu Shan-hsiang Yang 楊善翔 |
author |
Shan-hsiang Yang 楊善翔 |
spellingShingle |
Shan-hsiang Yang 楊善翔 Three-dimensional Sound Source Localization |
author_sort |
Shan-hsiang Yang |
title |
Three-dimensional Sound Source Localization |
title_short |
Three-dimensional Sound Source Localization |
title_full |
Three-dimensional Sound Source Localization |
title_fullStr |
Three-dimensional Sound Source Localization |
title_full_unstemmed |
Three-dimensional Sound Source Localization |
title_sort |
three-dimensional sound source localization |
publishDate |
2009 |
url |
http://ndltd.ncl.edu.tw/handle/00117977278704040317 |
work_keys_str_mv |
AT shanhsiangyang threedimensionalsoundsourcelocalization AT yángshànxiáng threedimensionalsoundsourcelocalization AT shanhsiangyang shēngyuánsānwéifāngwèizhēncèzhīyánjiū AT yángshànxiáng shēngyuánsānwéifāngwèizhēncèzhīyánjiū |
_version_ |
1717758428330328064 |
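
The abstract describes two steps that lend themselves to a short illustration: refining a correlation peak by parabolic interpolation, and finding the direction whose theoretical TDOA vector lies closest to the estimated one. The Python sketch below is only an illustration under stated assumptions, not the thesis's implementation: the far-field model, array radius, speed of sound, microphone pair ordering, 5-degree search grid, and all function names are hypothetical choices made here. It also omits the interpolation between neighboring directions and the weighted voting that the abstract mentions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0            # m/s, assumed room-temperature value
PAIRS = [(0, 1), (1, 2), (2, 0)]  # microphone pairs, assumed ordering


def triangle_array(radius=0.1):
    """Three microphones on an equilateral triangle in the z = 0 plane.

    The 0.1 m circumradius is an assumption, not a value from the thesis.
    """
    angles = np.deg2rad([90.0, 210.0, 330.0])
    return np.stack(
        [radius * np.cos(angles), radius * np.sin(angles), np.zeros(3)], axis=1
    )


def theoretical_tdoa(mics, azimuth_deg, elevation_deg):
    """Far-field TDOA (seconds) for each pair in PAIRS; inputs may be arrays."""
    az = np.deg2rad(np.asarray(azimuth_deg, dtype=float))
    el = np.deg2rad(np.asarray(elevation_deg, dtype=float))
    # Unit vector pointing from the array toward the source.
    u = np.stack(
        [np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)], axis=-1
    )
    baselines = np.array([mics[j] - mics[i] for i, j in PAIRS])  # shape (3, 3)
    # Arrival time at mic i minus arrival time at mic j, for a plane wave.
    return u @ baselines.T / SPEED_OF_SOUND


def nearest_direction(mics, tdoa_est, step_deg=5):
    """Return the grid (azimuth, elevation) whose TDOA vector is closest."""
    az_grid = np.arange(0, 360, step_deg)
    el_grid = np.arange(0, 90 + step_deg, step_deg)
    az, el = np.meshgrid(az_grid, el_grid, indexing="ij")
    table = theoretical_tdoa(mics, az, el)  # shape (n_az, n_el, 3)
    dist = np.linalg.norm(table - np.asarray(tdoa_est), axis=-1)
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    return float(az_grid[i]), float(el_grid[j])


def parabolic_peak(values):
    """Sub-sample location of the maximum via a three-point parabola fit."""
    values = np.asarray(values, dtype=float)
    k = int(np.argmax(values))
    if k == 0 or k == len(values) - 1:
        return float(k)  # no neighbor on one side, so no refinement
    left, mid, right = values[k - 1], values[k], values[k + 1]
    denom = left - 2.0 * mid + right
    if denom == 0.0:
        return float(k)  # flat top, keep the integer peak
    return k + 0.5 * (left - right) / denom
```

In this sketch, `parabolic_peak` would be applied to the samples of a cross-correlation function around its maximum, and the resulting lags (converted to seconds) would form the vector passed to `nearest_direction`. Because the three microphones are coplanar, a far-field model cannot tell a source above the array plane from its mirror image below it, which is why the search covers elevations from 0 to 90 degrees only.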