Real Time Speech Recognition based on PWP Thresholding and MFCC using SVM
The real-time performance of Automatic Speech Recognition (ASR) is a big challenge and needs high computing capability and exhaustive memory consumption. Getting a robust performance against inevitable various difficult situations such as speaker variations, accents, and noise is a tedious task. It’...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
D. G. Pylarinos
2020-08-01
|
Series: | Engineering, Technology & Applied Science Research |
Subjects: | |
Online Access: | http://etasr.com/index.php/ETASR/article/view/3759 |
Summary: | The real-time performance of Automatic Speech Recognition (ASR) is a big challenge and needs high computing capability and exhaustive memory consumption. Getting a robust performance against inevitable various difficult situations such as speaker variations, accents, and noise is a tedious task. It’s crucial to expand new and efficient approaches for speech signal extraction features and pre-processing. In order to fix the high dependency issue related to processing succeeding steps in ARS and enhance the extracted features’ quality, noise robustness can be solved within the ARS extraction block feature, removing implicitly the need for further additional specific compensation parameters or data collection. This paper proposes a new robust acoustic extraction approach development based on a hybrid technique consisting of Perceptual Wavelet Packet (PWP) and Mel Frequency Cepstral Coefficients (MFCCs). The proposed system was implemented on a Rasberry Pi board and its performance was checked in a clean environment, reaching 99% average accuracy. The recognition rate was improved (from 80% to 99%) for the majority of Signal-to-Noise Ratios (SNRs) under real noisy conditions for positive SNRs and considerably improved results especially for negative SNRs.
|
---|---|
ISSN: | 2241-4487 1792-8036 |