Summary: | Master's === National Cheng Kung University === Department of Electrical Engineering === 102 === This study proposes a low-cost, fast-trainable chip design for an automatic speaker-speech recognition (ASSR) system. The proposed system consists of four parts: a feature extraction module, a speaker model training module, a speaker recognition module, and a speech recognition module.
Linear predictive cepstral coefficients (LPCC) are adopted in the proposed feature extraction module. The speech recognition module uses dynamic time warping (DTW) to classify the target speech. The novel binary halved clustering (BHC) method uses binary-halved splitting to generate speaker models, meeting the low-complexity requirement. Compared with conventional works, simulation results indicate that the proposed hardware accelerator achieves 52% lower cost, 68% shorter response time, and an ASSR accuracy of 90%. This ASSR system efficiently implements the low-cost chip design.
This design has been taped out in TSMC’s 90 nm process. The chip area is 1.47 × 1.47 mm², in an 84-pin package; the gate count is 395K, and the power dissipation is 8.74 mW. The operating frequency is 50 MHz, and the sampling rate is 16 kHz.
|
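The abstract names dynamic time warping (DTW) as the classifier in the speech recognition module but gives no details. As an illustration only, the following is a minimal software sketch of textbook DTW with nearest-template classification. It assumes each utterance is a sequence of equal-length feature vectors (for example, LPCC frames) and uses a Euclidean frame distance. The function names `dtw_distance` and `classify`, the template dictionary, and the choice of distance are hypothetical and are not taken from the thesis or its hardware design.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: `a` and `b` are lists of equal-dimension float vectors,
    and the local frame cost is the Euclidean distance.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Allowed steps: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost.

    Assumption: `templates` maps a label to one reference sequence.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

With one stored template per word, `classify(query, templates)` returns the label whose template aligns to the query at the lowest accumulated cost. The quadratic table shown here is the plain formulation; a chip implementation would likely restrict or pipeline this computation, which the abstract does not describe.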