Summary: | Master's === National Cheng Kung University === Department of Engineering Science === 107 === Most recent companion robots are designed to interact with people through vision and sound. In this thesis, the author adds a sound source recognition system, built on a microphone array, to an existing facial expression recognition robot. The system consists of two parts: sound source localization and sound source separation. Localization uses the MUSIC (MUltiple SIgnal Classification) algorithm to estimate the angle of the sound source, whereas separation uses the GCC-NMF (Generalized Cross Correlation – Non-Negative Matrix Factorization) algorithm to separate the different sound sources. To improve separation accuracy after localization, the author selects appropriate microphone channels based on the sound directionality before separation.
Since the companion robot is intended to serve small families, the main goal of this study is to handle 2 to 3 sound signals with background noise levels typically in the range of about 45 to 55 dB. The results show that the MUSIC algorithm estimates the target source accurately and needs less computation time than conventional methods such as beamforming. As for separation, both direct listening to the audio files and spectrogram analysis indicate a significant effect on the results.
|
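
The localization step described in the abstract can be illustrated with a minimal narrowband MUSIC sketch. This is not the thesis implementation: it assumes a uniform linear array with half-wavelength spacing, a single frequency bin, and a known number of sources, none of which are stated in the abstract. The function names (`steering_matrix`, `simulate_snapshots`, `music_spectrum`, `estimate_angles`) are hypothetical and chosen only for this example.

```python
import numpy as np


def steering_matrix(angles_deg, n_mics, spacing_wavelengths=0.5):
    """Steering vectors (one column per angle) for an assumed uniform linear array."""
    theta = np.deg2rad(np.atleast_1d(angles_deg))
    mic_index = np.arange(n_mics)
    phase = -2j * np.pi * spacing_wavelengths * np.outer(mic_index, np.sin(theta))
    return np.exp(phase)  # shape: (n_mics, n_angles)


def simulate_snapshots(source_angles_deg, n_mics, n_snapshots, noise_std, rng):
    """Synthetic narrowband snapshots: uncorrelated sources plus white noise."""
    A = steering_matrix(source_angles_deg, n_mics)
    n_src = A.shape[1]
    s = (rng.standard_normal((n_src, n_snapshots))
         + 1j * rng.standard_normal((n_src, n_snapshots))) / np.sqrt(2)
    noise = noise_std * (rng.standard_normal((n_mics, n_snapshots))
                         + 1j * rng.standard_normal((n_mics, n_snapshots))) / np.sqrt(2)
    return A @ s + noise


def music_spectrum(snapshots, n_sources, angles_deg, spacing_wavelengths=0.5):
    """MUSIC pseudo-spectrum evaluated on a grid of candidate angles."""
    n_mics, n_snapshots = snapshots.shape
    # Sample spatial covariance matrix of the microphone signals.
    R = snapshots @ snapshots.conj().T / n_snapshots
    # eigh returns eigenvalues in ascending order, so the first
    # (n_mics - n_sources) eigenvectors span the noise subspace.
    _, eigvecs = np.linalg.eigh(R)
    noise_subspace = eigvecs[:, : n_mics - n_sources]
    A = steering_matrix(angles_deg, n_mics, spacing_wavelengths)
    # Steering vectors of true sources are nearly orthogonal to the noise
    # subspace, so the projection energy is small and its inverse peaks.
    proj_energy = np.sum(np.abs(noise_subspace.conj().T @ A) ** 2, axis=0)
    return 1.0 / proj_energy


def estimate_angles(spectrum, angles_deg, n_sources):
    """Pick the n_sources largest local maxima of the pseudo-spectrum."""
    angles_deg = np.asarray(angles_deg)
    inner = spectrum[1:-1]
    is_peak = (inner > spectrum[:-2]) & (inner >= spectrum[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    strongest = peak_idx[np.argsort(spectrum[peak_idx])[::-1][:n_sources]]
    return np.sort(angles_deg[strongest])
```

As a usage example, one could call `simulate_snapshots([-20.0, 35.0], n_mics=8, n_snapshots=200, noise_std=0.1, rng=np.random.default_rng(0))`, evaluate `music_spectrum` on a grid such as `np.arange(-90.0, 90.5, 0.5)`, and pass the result to `estimate_angles`. Real speech is broadband, so a practical system would typically apply a step like this per frequency bin of a short-time Fourier transform and combine the results; that extension is outside this sketch.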