Design Time Domain Filter Banks Using Least Squares Method to Calculate the Mel-Frequency Cepstral Coefficients for Speaker Recognition

Bibliographic Details
Main Authors: Sunrise Wu (吳尚叡)
Other Authors: Wu-Ton Chen
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/08178129842426697899
Description
Summary: Master's === China Institute of Technology (中華技術學院) === Master's Program, Graduate Institute of Electronic Engineering === 96 === To date, the most successful speaker recognition techniques are based on Mel-frequency cepstral coefficients (MFCCs) [1-4,11]. The conventional procedure for computing MFCCs consists of framing, Hamming windowing, transformation by the FFT (Fast Fourier Transform) [7], filtering with a Mel-scale triangular filter bank, taking the logarithmic energy of each filter output, and applying the DCT (Discrete Cosine Transform) [1-8]. The main contribution of this thesis is to replace the FFT [7] and the frequency-domain Mel-scale triangular filter bank [15] with a time-domain Mel-scale triangular filter bank. This time-domain filter bank [1-8,14] is designed by the least squares method [10,13] and is used to compute the MFCCs of the speakers' speech. Our experiments show that the speaker recognition rates of the conventional MFCC method [2,3,6,14] and of the new approach are very similar.
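
The two pipelines described in the summary can be sketched roughly as follows. This is an illustration only, not the thesis's implementation: the record does not say how the least squares problem is set up, so the sketch assumes one textbook formulation (a symmetric, linear-phase FIR filter fitted to a desired amplitude response on a uniform frequency grid). The function names, the common 2595*log10(1 + f/700) Mel formula, the filter length, and the choice of the square root of each triangle as the target amplitude (so that the filter's squared magnitude approximates the triangle) are all assumptions made for this example.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_triangles(n_filters, n_fft, fs):
    """Triangular weights over the n_fft // 2 + 1 FFT bins, one row per filter."""
    freqs = np.linspace(0.0, fs / 2.0, n_fft // 2 + 1)
    # Filter edges are equally spaced on the Mel scale.
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_rows(n_ceps, n_filters):
    """First n_ceps rows of an unnormalized DCT-II matrix."""
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (m + 0.5) / n_filters)

def mfcc_frame(frame, bank, n_ceps):
    """Conventional path: Hamming window, FFT, triangular bank, log, DCT."""
    n_fft = 2 * (bank.shape[1] - 1)
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    log_energy = np.log(bank @ power + 1e-12)
    return dct_rows(n_ceps, bank.shape[0]) @ log_energy

def ls_fir(target, half_len):
    """Least squares fit of a symmetric FIR filter (length 2*half_len + 1).

    target holds the desired amplitude on a uniform grid from 0 to pi.
    The amplitude of such a filter is a[0] + 2 * sum_k a[k] * cos(k * w),
    which is linear in a, so ordinary least squares applies.
    """
    w = np.linspace(0.0, np.pi, target.size)
    C = np.cos(np.outer(w, np.arange(half_len + 1)))
    C[:, 1:] *= 2.0
    a, *_ = np.linalg.lstsq(C, target, rcond=None)
    return np.concatenate([a[:0:-1], a])  # mirror a[1:] around the centre tap

def mfcc_frame_time(frame, filters, n_ceps):
    """Time-domain path: Hamming window, FIR filtering, log energy, DCT."""
    windowed = frame * np.hamming(frame.size)
    energy = np.array([np.sum(np.convolve(windowed, h) ** 2) for h in filters])
    return dct_rows(n_ceps, len(filters)) @ np.log(energy + 1e-12)

# Hypothetical settings chosen only for this demonstration.
fs, n_fft, n_filters, n_ceps = 8000.0, 512, 20, 12
bank = mel_triangles(n_filters, n_fft, fs)
filters = [ls_fir(np.sqrt(row), 64) for row in bank]
frame = np.random.default_rng(0).standard_normal(256)
print(mfcc_frame(frame, bank, n_ceps))
print(mfcc_frame_time(frame, filters, n_ceps))
```

The two paths scale the filter energies differently, so the coefficients are not expected to match exactly; a constant scale factor shared by all filters shifts only the zeroth coefficient after the logarithm and DCT.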