Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment

This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context dependent phones (triphones) and mel-frequency cepstral coeffic...


Bibliographic Details
Main Authors: J. Uhlir, P. Sovka, J. Novotny
Format: Article
Language: English
Published: Spolecnost pro radioelektronicke inzenyrstvi 2004-04-01
Series: Radioengineering
Online Access: http://www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf
id doaj-2b53b99a83444a95b03c1475133db3ab
issn 1210-2512
description This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficient (MFCC) analysis of speech. The main aim is to analyze and search for the optimal settings of the SCRS with respect to additive noise robustness, without using additional techniques for additive noise reduction. The analysis focuses on the appropriate setting of the MFCC computation, the adjustment of the silence model, and the grammar selection possibilities. It is shown that the correct performance of the SCRS strictly depends on an appropriate adjustment of the silence model. The ability of the silence model to adapt is confirmed. When the SNR is higher than 15 dB, suitable performance of the SCRS can be guaranteed without any modification of the triphone speech models by (1) the optimal setting of the MFCC computation and (2) proper silence model adaptation. The assumption that a speech command recognition system is used in an environment where the SNR is higher than 15 dB is fulfilled in many applications.
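The 15 dB condition in the description refers to the signal-to-noise ratio of the input. As a minimal sketch, assuming the clean speech and the additive noise are available as separate sample sequences, the SNR in dB could be estimated as ten times the base-10 logarithm of the ratio of their mean powers. The function names `snr_db` and `meets_threshold` are hypothetical and are not taken from the paper, which does not state how its SNR values were measured.

```python
import math


def snr_db(speech, noise):
    """Estimate SNR in dB from clean-speech and noise sample sequences.

    Assumption: both sequences are non-empty and the noise is not all zeros.
    """
    # Mean power of each signal (average of squared samples).
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_speech / p_noise)


def meets_threshold(speech, noise, threshold_db=15.0):
    """Return True when the estimated SNR exceeds the threshold (15 dB by default)."""
    return snr_db(speech, noise) > threshold_db


if __name__ == "__main__":
    speech = [1.0, -1.0] * 4   # toy signal with mean power 1.0
    noise = [0.1, -0.1] * 4    # toy noise with mean power 0.01
    print(round(snr_db(speech, noise), 2))
    print(meets_threshold(speech, noise))
```

In practice the clean speech is rarely available separately, so the noise power would typically be estimated from non-speech segments; that step is omitted here.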