Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment
This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coeffic...
Main Authors: | J. Uhlir, P. Sovka, J. Novotny |
---|---|
Format: | Article |
Language: | English |
Published: | Spolecnost pro radioelektronicke inzenyrstvi, 2004-04-01 |
Series: | Radioengineering |
Online Access: | http://www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf |
id |
doaj-2b53b99a83444a95b03c1475133db3ab |
---|---|
record_format |
Article |
spelling |
doaj-2b53b99a83444a95b03c1475133db3ab 2020-11-24T21:01:59Z eng Spolecnost pro radioelektronicke inzenyrstvi Radioengineering 1210-2512 2004-04-01 13117 Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment J. Uhlir P. Sovka J. Novotny This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficient (MFCC) analysis of speech. The main aim is to analyze and search for the optimal settings of the SCRS with respect to additive noise robustness, without the use of additional techniques for additive noise reduction. The analysis focuses on the appropriate setting of the MFCC computation, the silence model adjustment, and grammar selection possibilities. It is shown that the correct performance of the SCRS strictly depends on an appropriate adjustment of the silence model. The feasibility of silence model adaptation is confirmed. When the SNR is higher than 15 dB, suitable performance of the SCRS can be guaranteed without any modification of the triphone speech models by: 1. the optimal setting of the MFCC computation, 2. the proper silence model adaptation. The assumption that a speech command recognition system is used in an environment where the SNR is higher than 15 dB is fulfilled in many applications. www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
J. Uhlir P. Sovka J. Novotny |
spellingShingle |
J. Uhlir P. Sovka J. Novotny Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment Radioengineering |
author_facet |
J. Uhlir P. Sovka J. Novotny |
author_sort |
J. Uhlir |
title |
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment |
title_short |
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment |
title_full |
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment |
title_fullStr |
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment |
title_full_unstemmed |
Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment |
title_sort |
analysis and optimization of telephone speech command recognition system performance in noisy environment |
publisher |
Spolecnost pro radioelektronicke inzenyrstvi |
series |
Radioengineering |
issn |
1210-2512 |
publishDate |
2004-04-01 |
description |
This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on the Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context-dependent phones (triphones) and mel-frequency cepstral coefficient (MFCC) analysis of speech. The main aim is to analyze and search for the optimal settings of the SCRS with respect to additive noise robustness, without the use of additional techniques for additive noise reduction. The analysis focuses on the appropriate setting of the MFCC computation, the silence model adjustment, and grammar selection possibilities. It is shown that the correct performance of the SCRS strictly depends on an appropriate adjustment of the silence model. The feasibility of silence model adaptation is confirmed. When the SNR is higher than 15 dB, suitable performance of the SCRS can be guaranteed without any modification of the triphone speech models by: 1. the optimal setting of the MFCC computation, 2. the proper silence model adaptation. The assumption that a speech command recognition system is used in an environment where the SNR is higher than 15 dB is fulfilled in many applications. |
url |
http://www.radioeng.cz/fulltexts/2004/04_01_01_07.pdf |
work_keys_str_mv |
AT juhlir analysisandoptimizationoftelephonespeechcommandrecognitionsystemperformanceinnoisyenvironment AT psovka analysisandoptimizationoftelephonespeechcommandrecognitionsystemperformanceinnoisyenvironment AT jnovotny analysisandoptimizationoftelephonespeechcommandrecognitionsystemperformanceinnoisyenvironment |
_version_ |
1716777020723036160 |