Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment

This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coeffic...

Bibliographic Details
Main Authors:	J. Uhlir, P. Sovka, J. Novotny
Format:	Article
Language:	English
Published:	Spolecnost pro radioelektronicke inzenyrstvi 2004-04-01
Series:	Radioengineering
Online Access:	http://www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf

id	doaj-2b53b99a83444a95b03c1475133db3ab
record_format	Article
spelling	doaj-2b53b99a83444a95b03c1475133db3ab 2020-11-24T21:01:59Z eng Spolecnost pro radioelektronicke inzenyrstvi Radioengineering 1210-2512 2004-04-01 13117 Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment J. Uhlir P. Sovka J. Novotny This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficients analysis of speech (MFCC). The main aim is to analyze and search for the optimal settings of the SCRS with respect to additive noise robustness without the use of additional techniques for additive noise reduction. The analysis focuses on the appropriate setting of MFCC computation, the silence model adjustment, and grammar selection possibilities. It is shown that the correct performance of the SCRS strictly depends on an appropriate adjustment of the silence model. The ability of the silence model adaptation is confirmed. When SNR is higher than 15 dB, suitable performance of the SCRS can be guaranteed without any modification of the triphone speech models by: 1. the optimal setting of MFCC computation, 2. the proper silence model adaptation. The assumption that a speech command recognition system is used in an environment where SNR is higher than 15 dB is fulfilled in many applications. www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	J. Uhlir P. Sovka J. Novotny
spellingShingle	J. Uhlir P. Sovka J. Novotny Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment Radioengineering
author_facet	J. Uhlir P. Sovka J. Novotny
author_sort	J. Uhlir
title	Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
title_short	Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
title_full	Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
title_fullStr	Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
title_full_unstemmed	Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
title_sort	analysis and optimization of telephone speech command recognition system performance in noisy environment
publisher	Spolecnost pro radioelektronicke inzenyrstvi
series	Radioengineering
issn	1210-2512
publishDate	2004-04-01
description	This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficients analysis of speech (MFCC). The main aim is to analyze and search for the optimal settings of the SCRS with respect to additive noise robustness without the use of additional techniques for additive noise reduction. The analysis focuses on the appropriate setting of MFCC computation, the silence model adjustment, and grammar selection possibilities. It is shown that the correct performance of the SCRS strictly depends on an appropriate adjustment of the silence model. The ability of the silence model adaptation is confirmed. When SNR is higher than 15 dB, suitable performance of the SCRS can be guaranteed without any modification of the triphone speech models by: 1. the optimal setting of MFCC computation, 2. the proper silence model adaptation. The assumption that a speech command recognition system is used in an environment where SNR is higher than 15 dB is fulfilled in many applications.
url	http://www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf
work_keys_str_mv	AT juhlir analysisandoptimizationoftelephonespeechcommandrecognitionsystemperformanceinnoisyenvironment AT psovka analysisandoptimizationoftelephonespeechcommandrecognitionsystemperformanceinnoisyenvironment AT jnovotny analysisandoptimizationoftelephonespeechcommandrecognitionsystemperformanceinnoisyenvironment
_version_	1716777020723036160

Similar Items