Summary: | Master's === National Tsing Hua University === Department of Electrical Engineering === 103 === In a real environment, sound sources are coupled to the microphones by convolution with room responses, so separating the sources directly in the time domain is difficult and time-consuming. Existing approaches instead convert the mixed signals to the time-frequency domain with the short-time Fourier transform (STFT) and then apply Independent Component Analysis (ICA) in each frequency bin to separate the sources. This approach, however, suffers from two drawbacks: the scaling problem and the permutation problem. Of the two, the permutation problem is much more difficult to resolve, and it is the focus of this thesis. Based on the assumption that the temporal envelopes of neighboring frequencies from the same sound source should be highly correlated, we have developed an algorithm to solve the permutation problem. After the scaling and permutation problems are solved, the separated signals are converted back to the time domain by the inverse short-time Fourier transform (ISTFT) to complete the separation. In experiments 1 to 4, the sound sources were recorded in a room, and the separated signals were obtained using the steps above. The effectiveness of the algorithm was assessed by subjective and objective measures. In experiments 1 to 4, the sound sources labeled 1-4 were rated by the participants with an average score higher than 4.18 out of 5. In experiment 5, we compared our method with the methods from [23] and [25]; the present method improves the source-to-interference ratio (SIR) by 3.1 dB. The experimental results show that our method effectively solves the permutation problem and improves the separation performance.
|
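The envelope-correlation idea behind the permutation step can be illustrated with a minimal sketch. The abstract does not say how the correlations are computed or how the best ordering is selected, so everything below is an assumption made for illustration: magnitude envelopes, Pearson correlation against the previously aligned neighboring bin, and an exhaustive search over orderings. The function name `align_permutations` and the array layout are hypothetical and are not taken from the thesis.

```python
import itertools
import numpy as np


def align_permutations(Y):
    """Reorder per-bin ICA outputs so each source keeps the same index.

    Y: complex array of shape (n_bins, n_sources, n_frames), assumed to hold
    the separated STFT components of each frequency bin in arbitrary order.
    Returns the reordered copy and the permutation chosen for each bin.
    """
    n_bins, n_src, _ = Y.shape
    aligned = Y.copy()
    perms = [tuple(range(n_src))]  # bin 0 is taken as the reference order
    for f in range(1, n_bins):
        ref = np.abs(aligned[f - 1])  # envelopes of the already aligned neighbor
        cur = np.abs(Y[f])            # envelopes of the current bin
        # corr[i, j]: correlation between reference envelope i and current envelope j
        corr = np.corrcoef(ref, cur)[:n_src, n_src:]
        corr = np.nan_to_num(corr)    # constant envelopes would give NaN
        # Assumption: exhaustive search, practical only for a few sources
        best = max(
            itertools.permutations(range(n_src)),
            key=lambda p: sum(corr[i, p[i]] for i in range(n_src)),
        )
        aligned[f] = Y[f][list(best)]
        perms.append(best)
    return aligned, perms
```

Calling `align_permutations(Y)` on the per-bin ICA outputs returns the reordered components, which could then be passed to the ISTFT. Because this sketch aligns each bin only against its immediate neighbor, a single wrong decision propagates to all higher bins; the thesis algorithm may handle this differently.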