ROBUST VOICE ACTIVITY DETECTION ALGORITHM BASED ON THE PERCEPTUAL WAVELET PACKET TRANSFORM

Bibliographic Details
Main Authors: Hsin-Te Wu, 吳信德
Other Authors: Shi-Huang Chen
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/13975413305915893915
Description
Summary: Master's thesis === Shu-Te University === Department of Computer Science and Information Engineering === 94 === This thesis presents a voice activity detection (VAD) algorithm for the Adaptive Multi-Rate (AMR) codec. VAD refers to the ability to distinguish speech from noise and is required in a variety of speech processing systems. For example, most speech coders, e.g. GSM and AMR, include a VAD module. The VAD module can also improve power efficiency and reduce radiated emissions through discontinuous transmission (DTX). The VAD module of AMR uses a set of methods to distinguish speech from noise: background noise estimation, a channel energy estimator, and a channel SNR estimator. Approaches that rely on pre-defined threshold values for VAD are not appropriate for noisy environments. Moreover, it is difficult to derive a fixed threshold value that gives accurate VAD under variable pronunciation conditions. Furthermore, the threshold values used in some traditional VAD algorithms are calculated in the silence intervals and are unsuitable for noisy conditions. A robust VAD should therefore use time-varying threshold values to achieve better performance. This thesis presents a new VAD algorithm that overcomes the above problems and yields threshold values with more consistent accuracy. It is shown in this thesis that the adaptive weighted threshold (AWT) is a robust threshold value for VAD under various noisy environments. One advantage of the new algorithm is that pre-defined threshold values are not necessary. In addition, the proposed algorithm can adapt the VAD threshold value to variable speech conditions. Experimental results show that the proposed VAD algorithm outperforms the VAD of G.729B and the VAD of AMR.
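
The contrast the summary draws between a fixed threshold and a time-varying one can be illustrated with a minimal Python sketch. This is a generic energy-based detector, not the thesis's AWT or its perceptual wavelet packet front end, neither of which is specified in this record. The function names and the parameters `init_frames`, `margin`, and `alpha` are hypothetical and chosen only for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean-square energy of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def adaptive_threshold_vad(energies, init_frames=5, margin=3.0, alpha=0.9):
    """Flag frames as speech using a threshold that follows the noise floor.

    Assumptions (not from the thesis): the first `init_frames` frames are
    noise only, and the threshold is simply `margin` times a running noise
    estimate smoothed with factor `alpha`.
    """
    noise = sum(energies[:init_frames]) / init_frames
    decisions = []
    for energy in energies:
        threshold = margin * noise  # recomputed per frame, so it varies in time
        is_speech = energy > threshold
        decisions.append(is_speech)
        if not is_speech:
            # Update the noise estimate only on frames judged to be noise.
            noise = alpha * noise + (1 - alpha) * energy
    return decisions


# The same pattern at two noise levels: a quiet and a 100x louder background.
quiet = [1.0] * 10 + [20.0] * 5 + [1.0] * 10
loud = [100.0 * e for e in quiet]

adaptive_quiet = adaptive_threshold_vad(quiet)
adaptive_loud = adaptive_threshold_vad(loud)

# A fixed threshold tuned for the quiet case, applied to the loud case.
fixed_loud = [e > 3.0 for e in loud]
```

In this sketch the adaptive detector marks the same five frames as speech at both noise levels, while the fixed threshold of 3.0 marks every frame of the louder recording as speech. A scheme this simple still depends on noise-only frames to update its estimate, which is one of the limitations the summary attributes to traditional algorithms.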