Summary: | Master's === National Chiao Tung University === Department of Electronics Engineering and Institute of Electronics === 101 === In this thesis, we propose a pitch-based noise reduction (NR) system and a VAD-based wide dynamic range compression (WDRC) scheme, both built on a quasi-ANSI 1/3-octave filter bank with low group delay for practical implementation in hearing aid (HA) systems. The proposed pitch-based NR comprises pitch-based voice activity detection (VAD) and onset-dependent noise attenuation (ONA). It exploits characteristics of speech such as pitch and its harmonics, onsets, and the typical duration of a monosyllabic word. To compensate for the low frequency resolution of the quasi-ANSI filter bank, the proposed pitch-based VAD combines pitch and onset features with flexible harmonic detection to improve detection accuracy. The proposed ONA is likewise designed to overcome the poor resolution of the filter bank. In addition, an update mechanism for the long-term average magnitude is employed to improve onset detection. Simulation results show that the proposed pitch-based NR performs well both in stationary background noise (the user is still) and in highly dynamic background noise (the user is moving). The accuracy of the proposed pitch-based VAD is comparable to that of a pitch-based VAD using an ANSI filter bank, which has high resolution. Its average accuracy is about 83.70% and 85.70% in stationary and dynamic noise, respectively. The proposed ONA improves the segmental signal-to-noise ratio (SNRseg) and the signal-to-noise ratio (SNR) by an average of 5.95 dB and 9.12 dB in stationary noise, and by 6.49 dB and 9.47 dB in dynamic noise. Moreover, the average improvement in sound quality (PESQ) is 0.19 and 0.22 in stationary and dynamic noise, respectively.
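For readers unfamiliar with the two SNR figures quoted above, the following is a minimal sketch of how global SNR and segmental SNR are commonly computed against a clean reference. It is not taken from the thesis: the function names, the frame length of 256 samples, and the per-frame clamp of -10 dB to 35 dB are assumptions chosen for illustration.

```python
import math


def _frame_snr_db(clean, processed):
    """SNR in dB of one block: clean energy over error energy (None if silent)."""
    signal = sum(x * x for x in clean)
    noise = sum((x - y) ** 2 for x, y in zip(clean, processed))
    if signal == 0.0:
        return None          # no reference energy, SNR undefined
    if noise == 0.0:
        return float("inf")  # processed block equals the clean block
    return 10.0 * math.log10(signal / noise)


def snr_db(clean, processed):
    """Global SNR over the whole signal."""
    return _frame_snr_db(clean, processed)


def segmental_snr_db(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to [lo, hi] dB (assumed limits)."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        value = _frame_snr_db(clean[start:start + frame_len],
                              processed[start:start + frame_len])
        if value is None:
            continue  # skip frames with no reference energy
        values.append(min(hi, max(lo, value)))
    return sum(values) / len(values) if values else float("nan")
```

Under these assumptions, an improvement figure would be the metric measured on the NR output minus the same metric measured on the noisy input, both against the same clean reference.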
The proposed VAD-based WDRC increases the energy difference between speech and noise. Because WDRC algorithms are usually developed for clean-speech scenarios without considering background noise, the compression characteristic of WDRC may suppress high-energy speech more than low-energy background noise. This causes an undesired interaction when NR and WDRC are cascaded: the WDRC block can degrade the performance of the NR block, reducing the energy difference between speech and noise and thereby degrading speech intelligibility. Using the VAD information from the NR block, the WDRC can process speech regions and noise regions differently and thus increase speech intelligibility. Simulation results show that the proposed VAD-based WDRC helps reduce the undesired interaction between NR and WDRC.
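The interaction described above can be illustrated with a small sketch of a static compression curve whose gain is gated by a VAD flag. This is only an illustration under stated assumptions, not the thesis's algorithm: the threshold, compression ratio, linear gain, and the fixed 6 dB reduction applied to noise frames are all hypothetical values.

```python
def wdrc_gain_db(level_db, threshold_db=45.0, ratio=3.0, linear_gain_db=20.0):
    """Static WDRC curve: constant gain below the threshold, compressed above it."""
    if level_db <= threshold_db:
        return linear_gain_db
    # Above the threshold, output grows by 1/ratio dB per input dB.
    return linear_gain_db - (level_db - threshold_db) * (1.0 - 1.0 / ratio)


def vad_gated_gain_db(level_db, is_speech, noise_offset_db=6.0):
    """Hypothetical gating rule: noise frames receive noise_offset_db less gain."""
    gain = wdrc_gain_db(level_db)
    return gain if is_speech else gain - noise_offset_db


# Speech at 75 dB and noise at 40 dB differ by 35 dB at the input.
speech_out = 75.0 + vad_gated_gain_db(75.0, is_speech=True)
noise_plain = 40.0 + wdrc_gain_db(40.0)
noise_gated = 40.0 + vad_gated_gain_db(40.0, is_speech=False)
gap_plain = speech_out - noise_plain  # gap left by plain WDRC
gap_gated = speech_out - noise_gated  # gap left with VAD gating
```

With these assumed parameters, plain WDRC shrinks the 35 dB input gap because loud speech receives less gain than quiet noise, while the gated version gives some of that gap back.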
Both the proposed pitch-based NR and the VAD-based WDRC have low computational complexity, and the modifications deliver a substantial performance gain at slight additional cost. Finally, the total latency of the proposed algorithm, including the quasi-ANSI filter bank, is only 11.3 ms, which meets the latency requirement of HA systems and makes the design suitable for HA applications.
|