Two-Features Voice Activity Detection and Transient Noise Classification in Low SNR Environment


Bibliographic Details
Main Authors: Chen, Szu-Hong, 陳思宏
Other Authors: Hu, Hwai-Tsu
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/50557572767591710285
Description
Summary: Master's thesis, National Ilan University, Master's Program of the Department of Electronic Engineering, academic year 102. In recent years, smart home appliances and mobile devices have become increasingly prevalent. This study focuses on the smart TV architecture, in particular its front-end voice signal processing. It aims to develop an efficient voice activity detection (VAD) algorithm that discriminates voice from noise by means of noise cancellation and support vector machine (SVM) techniques, thereby providing cleaner voice signals for follow-up applications. The thesis consists of three parts. First, two voice-related parameters, frame energy and spectral entropy, serve as the basis of the decision. Taken individually, each parameter fluctuates and is unstable during signal analysis, which leads to incorrect VAD decisions. The two are therefore combined into a single vector to control the fluctuation, which simplifies the VAD decision and allows voiced and silent frames to be distinguished effectively. Second, noise must be cancelled for the voice application to perform well. The VAD results are therefore used directly to identify silent frames that contain only background noise, and these frames are used to derive the adaptive noise cancellation filter. Third, because conventional VAD algorithms have difficulty separating transient noise from voice signals, this study relies on the learning and classification capability of the SVM to distinguish the two. The experiments used white noise and babble noise samples as background noise, together with four types of transient noise. VAD performance was evaluated in terms of detection accuracy and the perceptual evaluation of speech quality (PESQ) measure. The discussion covers the effects of both background noise cancellation and SVM classification.
The experimental results show that the proposed two-parameter VAD algorithm performs better when babble noise is present at low signal-to-noise ratios. Compared with the reference method, the two-parameter algorithm has relatively low computational complexity and is therefore suitable for small mobile devices. The results also reveal that background noise has a significant influence on the classification results, which suggests that cancelling background noise will improve the performance of subsequent applications.
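
The two parameters named in the summary, frame energy and spectral entropy, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the 64-sample frame length, the use of mean squared amplitude for energy, and the naive DFT are all assumptions made for illustration, and the sketch stops at forming the two-element feature vector without guessing at the decision rule the author applied to it.

```python
import cmath
import math
import random


def frame_energy(frame):
    """Mean squared amplitude of one frame (an assumed energy definition)."""
    return sum(x * x for x in frame) / len(frame)


def spectral_entropy(frame):
    """Shannon entropy, in bits, of the frame's normalized power spectrum."""
    n = len(frame)
    power = []
    # Naive DFT over the non-negative frequency bins; adequate for short frames.
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    total = sum(power)
    if total == 0.0:
        return 0.0  # an all-zero frame carries no spectral information
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def two_feature_vector(frame):
    """Stack both parameters into one vector (hypothetical helper name)."""
    return (frame_energy(frame), spectral_entropy(frame))


# Illustrative frames: a pure tone (structured) and uniform random noise.
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]

tone_features = two_feature_vector(tone)
noise_features = two_feature_vector(noise)
```

In this sketch the tone concentrates its power in a single frequency bin and therefore yields a much lower spectral entropy than the noise frame, whose power is spread across bins. That contrast is the general reason spectral entropy is paired with energy for voice detection.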