Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System
Master's === National Chiao Tung University === Department of Electronics Engineering and Institute of Electronics === 101 === In this thesis, we propose a pitch-based noise reduction (NR) system and a VAD-based wide dynamic range compression (WDRC) scheme that adopt a quasi-ANSI 1/3-octave filter bank with low group delay for realistic implementation in hearing aid (HA) systems. The p...
Main Authors: | Huang, Yi-Cheng, 黃義政 |
---|---|
Other Authors: | Jou, Shyh-Jye |
Format: | Others |
Language: | en_US |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/68460477876351318822 |
id |
ndltd-TW-101NCTU5428222 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101NCTU5428222 2015-10-13T23:16:04Z http://ndltd.ncl.edu.tw/handle/68460477876351318822 Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System 適用於華語數位助聽器之低延遲且類ANSI S1.11 1/3-octave規範濾波器組的音高式噪音消除與語音偵測輔助之廣泛動態範圍壓縮技術設計 Huang, Yi-Cheng 黃義政 Master's National Chiao Tung University Department of Electronics Engineering and Institute of Electronics 101 In this thesis, we propose a pitch-based noise reduction (NR) system and a VAD-based wide dynamic range compression (WDRC) scheme that adopt a quasi-ANSI 1/3-octave filter bank with low group delay for realistic implementation in hearing aid (HA) systems. The proposed pitch-based NR comprises pitch-based voice activity detection (VAD) and onset-dependent noise attenuation (ONA). It exploits characteristics of speech such as pitch and its corresponding harmonics, onsets, and the duration of a monosyllabic word. Because the quasi-ANSI filter bank has low resolution, the proposed pitch-based VAD combines the pitch and onset features with flexible harmonics detection to improve VAD accuracy. The proposed ONA is likewise designed to overcome the poor resolution of the filter bank. In addition, an update mechanism for the long-term average magnitude is employed to enhance detection of the onset feature. Simulation results show that the proposed pitch-based NR performs well in both stationary background noise (the user is still) and highly dynamic background noise (the user is moving). The accuracy of the proposed pitch-based VAD is comparable to that of a pitch-based VAD adopting the high-resolution ANSI filter bank. The average accuracy of the proposed pitch-based VAD is about 83.70% and 85.70% in stationary and dynamic noise, respectively. 
The average improvements in segmental signal-to-noise ratio (SNRseg) and signal-to-noise ratio (SNR) achieved by the proposed ONA are 5.95 dB and 9.12 dB in stationary noise and 6.49 dB and 9.47 dB in dynamic noise. Moreover, the average improvement in sound quality (PESQ) is 0.19 and 0.22 in stationary and dynamic noise environments, respectively. The proposed VAD-based WDRC enhances the energy difference between speech and noise. Because WDRC algorithms are usually developed for clean-speech scenarios without considering background noise, high-energy speech may be suppressed more than low-energy background noise, owing to the characteristics of WDRC. This causes an undesired interaction effect when NR and WDRC are connected: the WDRC block may degrade the performance of NR, so the energy difference between speech and noise decreases and speech intelligibility is degraded. With the VAD information from the NR block, WDRC can apply different operations to speech regions and noise regions, increasing speech intelligibility. Simulation results show that the proposed VAD-based WDRC helps reduce the undesired interaction effect between NR and WDRC. For the proposed pitch-based NR and VAD-based WDRC, the computational complexity is low, and the slight cost of the modifications yields outstanding performance in exchange. Finally, the total latency of the proposed algorithm, including the quasi-ANSI filter bank, is only 11.3 ms, which meets the requirement of HA systems and makes it suitable for HA applications. Jou, Shyh-Jye 周世傑 2013 學位論文 ; thesis 102 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chiao Tung University === Department of Electronics Engineering and Institute of Electronics === 101 === In this thesis, we propose a pitch-based noise reduction (NR) system and a VAD-based wide dynamic range compression (WDRC) scheme that adopt a quasi-ANSI 1/3-octave filter bank with low group delay for realistic implementation in hearing aid (HA) systems. The proposed pitch-based NR comprises pitch-based voice activity detection (VAD) and onset-dependent noise attenuation (ONA). It exploits characteristics of speech such as pitch and its corresponding harmonics, onsets, and the duration of a monosyllabic word. Because the quasi-ANSI filter bank has low resolution, the proposed pitch-based VAD combines the pitch and onset features with flexible harmonics detection to improve VAD accuracy. The proposed ONA is likewise designed to overcome the poor resolution of the filter bank. In addition, an update mechanism for the long-term average magnitude is employed to enhance detection of the onset feature. Simulation results show that the proposed pitch-based NR performs well in both stationary background noise (the user is still) and highly dynamic background noise (the user is moving). The accuracy of the proposed pitch-based VAD is comparable to that of a pitch-based VAD adopting the high-resolution ANSI filter bank. The average accuracy of the proposed pitch-based VAD is about 83.70% and 85.70% in stationary and dynamic noise, respectively. The average improvements in segmental signal-to-noise ratio (SNRseg) and signal-to-noise ratio (SNR) achieved by the proposed ONA are 5.95 dB and 9.12 dB in stationary noise and 6.49 dB and 9.47 dB in dynamic noise. Moreover, the average improvement in sound quality (PESQ) is 0.19 and 0.22 in stationary and dynamic noise environments, respectively.
The proposed VAD-based WDRC enhances the energy difference between speech and noise. Because WDRC algorithms are usually developed for clean-speech scenarios without considering background noise, high-energy speech may be suppressed more than low-energy background noise, owing to the characteristics of WDRC. This causes an undesired interaction effect when NR and WDRC are connected: the WDRC block may degrade the performance of NR, so the energy difference between speech and noise decreases and speech intelligibility is degraded. With the VAD information from the NR block, WDRC can apply different operations to speech regions and noise regions, increasing speech intelligibility. Simulation results show that the proposed VAD-based WDRC helps reduce the undesired interaction effect between NR and WDRC.
For the proposed pitch-based NR and VAD-based WDRC, the computational complexity is low, and the slight cost of the modifications yields outstanding performance in exchange. Finally, the total latency of the proposed algorithm, including the quasi-ANSI filter bank, is only 11.3 ms, which meets the requirement of HA systems and makes it suitable for HA applications.
|
author2 |
Jou, Shyh-Jye |
author_facet |
Jou, Shyh-Jye Huang, Yi-Cheng 黃義政 |
author |
Huang, Yi-Cheng 黃義政 |
spellingShingle |
Huang, Yi-Cheng 黃義政 Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System |
author_sort |
Huang, Yi-Cheng |
title |
Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System |
title_short |
Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System |
title_full |
Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System |
title_fullStr |
Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System |
title_full_unstemmed |
Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System |
title_sort |
design of pitch based noise reduction adopting low latency quasi ansi s1.11 1/3 octave filter bank and vad-based wide dynamic range compression for mandarin digital hearing aid system |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/68460477876351318822 |
work_keys_str_mv |
AT huangyicheng designofpitchbasednoisereductionadoptinglowlatencyquasiansis11113octavefilterbankandvadbasedwidedynamicrangecompressionformandarindigitalhearingaidsystem AT huángyìzhèng designofpitchbasednoisereductionadoptinglowlatencyquasiansis11113octavefilterbankandvadbasedwidedynamicrangecompressionformandarindigitalhearingaidsystem AT huangyicheng shìyòngyúhuáyǔshùwèizhùtīngqìzhīdīyánchíqiělèiansis11113octaveguīfànlǜbōqìzǔdeyīngāoshìzàoyīnxiāochúyǔyǔyīnzhēncèfǔzhùzhīguǎngfàndòngtàifànwéiyāsuōjìshùshèjì AT huángyìzhèng shìyòngyúhuáyǔshùwèizhùtīngqìzhīdīyánchíqiělèiansis11113octaveguīfànlǜbōqìzǔdeyīngāoshìzàoyīnxiāochúyǔyǔyīnzhēncèfǔzhùzhīguǎngfàndòngtàifànwéiyāsuōjìshùshèjì |
_version_ |
1718084927506874368 |
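The description above reports noise-reduction gains as improvements in segmental SNR (SNRseg). The record does not say how the thesis computed this metric, so the following is only a minimal sketch of one common definition: the per-frame SNR in dB between a clean reference and a processed signal, clamped to a fixed range and averaged over frames. The function name `segmental_snr`, the frame length of 256 samples, and the clamp range of -10 dB to 35 dB are illustrative assumptions, not values taken from the thesis.

```python
import math


def segmental_snr(clean, processed, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB between a clean reference and a processed signal.

    frame_len, lo_db and hi_db are assumed defaults for illustration only.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    # Walk over non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, out))
        if signal_energy == 0.0:
            continue  # skip frames where the reference is silent
        if error_energy == 0.0:
            snr_db = hi_db  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        # Clamp so that a few extreme frames do not dominate the average.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Under this definition, an improvement figure would be the difference `segmental_snr(clean, enhanced) - segmental_snr(clean, noisy)`, averaged over the test material.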