Real Time Speech Recognition based on PWP Thresholding and MFCC using SVM

The real-time performance of Automatic Speech Recognition (ASR) is a big challenge and needs high computing capability and exhaustive memory consumption. Getting a robust performance against inevitable various difficult situations such as speaker variations, accents, and noise is a tedious task. It’...

Full description

Bibliographic Details
Main Authors: W. Helali, Ζ. Hajaiej, A. Cherif
Format: Article
Language:English
Published: D. G. Pylarinos 2020-08-01
Series:Engineering, Technology & Applied Science Research
Subjects:
SVM
Online Access:http://etasr.com/index.php/ETASR/article/view/3759
Description
Summary:The real-time performance of Automatic Speech Recognition (ASR) is a big challenge and needs high computing capability and exhaustive memory consumption. Getting a robust performance against inevitable various difficult situations such as speaker variations, accents, and noise is a tedious task. It’s crucial to expand new and efficient approaches for speech signal extraction features and pre-processing. In order to fix the high dependency issue related to processing succeeding steps in ARS and enhance the extracted features’ quality, noise robustness can be solved within the ARS extraction block feature, removing implicitly the need for further additional specific compensation parameters or data collection. This paper proposes a new robust acoustic extraction approach development based on a hybrid technique consisting of Perceptual Wavelet Packet (PWP) and Mel Frequency Cepstral Coefficients (MFCCs). The proposed system was implemented on a Rasberry Pi board and its performance was checked in a clean environment, reaching 99% average accuracy. The recognition rate was improved (from 80% to 99%) for the majority of Signal-to-Noise Ratios (SNRs) under real noisy conditions for positive SNRs and considerably improved results especially for negative SNRs.
ISSN:2241-4487
1792-8036