Three-dimensional Sound Source Localization

Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 97 === In this thesis, we study and implement a system that detects the direction of a sound source in three-dimensional space. For the hardware part, an equilateral-triangle microphone array of only three microphones is used to capture the voice signals. For the so...

Bibliographic Details
Main Authors: Shan-hsiang Yang, 楊善翔
Other Authors: Hung-yan Gu
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/00117977278704040317
id ndltd-TW-097NTUS5392012
record_format oai_dc
spelling ndltd-TW-097NTUS5392012 2015-10-13T14:49:22Z http://ndltd.ncl.edu.tw/handle/00117977278704040317 Three-dimensional Sound Source Localization 聲源三維方位偵測之研究 Shan-hsiang Yang 楊善翔 Master's National Taiwan University of Science and Technology Department of Computer Science and Information Engineering 97 In this thesis, we study and implement a system that detects the direction of a sound source in three-dimensional space. For the hardware part, an equilateral-triangle microphone array of only three microphones is used to capture the voice signals. For the software part, VAD (Voice Activity Detection), TDOA (Time Delay of Arrival) estimation, and direction detection are executed in order. For VAD, we propose a method based on spectral entropy plus SNR verification to distinguish speech frames from non-speech frames. To estimate TDOA, an approximation algorithm is used to compute a generalized cross-correlation function. We propose a synchronous phase replication method to solve the problem of unstable phase, and a parabolic-interpolation-based method to increase the accuracy of the estimated TDOA values. Then, the distances between the estimated vector of TDOA values and the vectors of theoretical values are computed to find the direction of the sound source; the accuracy of this step is also improved by interpolation. Furthermore, we propose a weighted voting mechanism to determine the final direction angle from the angles obtained in several speech frames. According to the results of online experiments, our system can run in real time with a small amount of computation. The average azimuth error is 3.43 degrees and the average elevation error is 2.08 degrees, so the overall performance of our sound source localization system is good. Hung-yan Gu 古鴻炎 2009 學位論文 ; thesis 78 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 97 === In this thesis, we study and implement a system that detects the direction of a sound source in three-dimensional space. For the hardware part, an equilateral-triangle microphone array of only three microphones is used to capture the voice signals. For the software part, VAD (Voice Activity Detection), TDOA (Time Delay of Arrival) estimation, and direction detection are executed in order. For VAD, we propose a method based on spectral entropy plus SNR verification to distinguish speech frames from non-speech frames. To estimate TDOA, an approximation algorithm is used to compute a generalized cross-correlation function. We propose a synchronous phase replication method to solve the problem of unstable phase, and a parabolic-interpolation-based method to increase the accuracy of the estimated TDOA values. Then, the distances between the estimated vector of TDOA values and the vectors of theoretical values are computed to find the direction of the sound source; the accuracy of this step is also improved by interpolation. Furthermore, we propose a weighted voting mechanism to determine the final direction angle from the angles obtained in several speech frames. According to the results of online experiments, our system can run in real time with a small amount of computation. The average azimuth error is 3.43 degrees and the average elevation error is 2.08 degrees, so the overall performance of our sound source localization system is good.
author2 Hung-yan Gu
author_facet Hung-yan Gu
Shan-hsiang Yang
楊善翔
author Shan-hsiang Yang
楊善翔
spellingShingle Shan-hsiang Yang
楊善翔
Three-dimensional Sound Source Localization
author_sort Shan-hsiang Yang
title Three-dimensional Sound Source Localization
title_short Three-dimensional Sound Source Localization
title_full Three-dimensional Sound Source Localization
title_fullStr Three-dimensional Sound Source Localization
title_full_unstemmed Three-dimensional Sound Source Localization
title_sort three-dimensional sound source localization
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/00117977278704040317
work_keys_str_mv AT shanhsiangyang threedimensionalsoundsourcelocalization
AT yángshànxiáng threedimensionalsoundsourcelocalization
AT shanhsiangyang shēngyuánsānwéifāngwèizhēncèzhīyánjiū
AT yángshànxiáng shēngyuánsānwéifāngwèizhēncèzhīyánjiū
_version_ 1717758428330328064
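
The abstract mentions refining TDOA estimates with parabolic interpolation. The record does not give the thesis's actual algorithm, so the following is only a minimal sketch under stated assumptions: it uses a plain time-domain cross-correlation (not the approximated generalized cross-correlation or the synchronous phase replication described in the abstract), and the names `cross_correlation`, `parabolic_peak`, and `estimate_tdoa`, as well as the sampling rate and lag range, are hypothetical.

```python
import math


def cross_correlation(x, y, max_lag):
    """Return (lags, r) with r[lag] = sum_n x[n + lag] * y[n] for |lag| <= max_lag.

    Assumption: a direct time-domain correlation stands in for the
    generalized cross-correlation mentioned in the abstract.
    """
    n = len(x)
    lags = list(range(-max_lag, max_lag + 1))
    values = []
    for lag in lags:
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += x[j] * y[i]
        values.append(total)
    return lags, values


def parabolic_peak(lags, values):
    """Refine the integer-lag maximum with a parabola through it and its two neighbours."""
    k = max(range(len(values)), key=values.__getitem__)
    if k == 0 or k == len(values) - 1:
        return float(lags[k])  # no neighbour on one side, keep the integer lag
    a, b, c = values[k - 1], values[k], values[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        return float(lags[k])  # flat top, nothing to refine
    # Vertex of the parabola, as an offset in (-0.5, 0.5) from the integer peak.
    return lags[k] + 0.5 * (a - c) / denom


def estimate_tdoa(x, y, fs, max_lag):
    """Delay of x relative to y in seconds (positive when x arrives later)."""
    lags, values = cross_correlation(x, y, max_lag)
    return parabolic_peak(lags, values) / fs


if __name__ == "__main__":
    fs = 16000.0  # assumed sampling rate in Hz
    # Synthetic Gaussian pulses; the second is delayed by 2.3 samples.
    ref = [math.exp(-0.5 * ((n - 32.0) / 3.0) ** 2) for n in range(64)]
    delayed = [math.exp(-0.5 * ((n - 34.3) / 3.0) ** 2) for n in range(64)]
    print(estimate_tdoa(delayed, ref, fs, max_lag=8) * fs, "samples")
```

The interpolation step matters because the integer lag alone can only resolve delays to one sample period; with a small array, a fraction of a sample can correspond to several degrees of direction error.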