An integrated approach to feature compensation combining particle filters and Hidden Markov Models for robust speech recognition

The performance of automatic speech recognition systems often degrades in adverse conditions where there is a mismatch between training and testing conditions. This is true for most modern systems which employ Hidden Markov Models (HMMs) to decode speech utterances. One strategy is to map the distor...


Bibliographic Details
Main Author: Mushtaq, Aleem
Other Authors: Clements, Mark A.
Format: Others
Language: en_US
Published: Georgia Institute of Technology 2013
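Subjects: Particle filter; Hidden Markov model; Robust speech recognition; Clustering; Markov chain Monte Carlo; Hidden Markov models; Speech perception; Monte Carlo method; Algorithms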
Online Access: http://hdl.handle.net/1853/48982
id ndltd-GATECH-oai-smartech.gatech.edu-1853-48982
record_format oai_dc
spelling ndltd-GATECH-oai-smartech.gatech.edu-1853-48982; 2016-05-12T03:48:43Z; Georgia Institute of Technology; Clements, Mark A.; 2013-09-19T12:19:03Z; 2013-09-19T12:19:03Z; 2013-08; 2013-05-15; August 2013; 2013-09-19T12:19:03Z; Dissertation; application/pdf; http://hdl.handle.net/1853/48982; en_US
collection NDLTD
language en_US
format Others
sources NDLTD
topic Particle filter
Hidden Markov model
Robust speech recognition
Clustering
Markov chain Monte Carlo
Hidden Markov models
Speech perception
Monte Carlo method
Algorithms
description The performance of automatic speech recognition systems often degrades in adverse conditions where there is a mismatch between training and testing conditions. This is true for most modern systems, which employ Hidden Markov Models (HMMs) to decode speech utterances. One strategy is to map the distorted features back to clean speech features that correspond well to the features used to train the HMMs. This can be achieved by treating the noisy speech as a distorted version of the clean speech of interest. Under this framework, we can track and consequently extract the underlying clean speech from the noisy signal and use this derived signal to perform utterance recognition. The particle filter is a versatile tracking technique that can be used where conventional techniques such as the Kalman filter often fall short. We propose a particle-filter-based algorithm that compensates the corrupted features according to an additive noise model, incorporating both the statistics from clean speech HMMs and the observed background noise to map noisy features back to clean speech features. Instead of using specific knowledge at the model and state levels from HMMs, which is hard to estimate, we pool model states into clusters as side information. Since each cluster encompasses more statistics than the original HMM states, there is a higher possibility that the newly formed probability density function at the cluster level can cover the underlying speech variation and generate appropriate particle filter samples for feature compensation. Additionally, a dynamic joint tracking framework that monitors the clean speech signal and the noise simultaneously is introduced to obtain good noise statistics. In this approach, the information available from clean speech tracking can be used effectively for noise estimation. The availability of dynamic noise information can enhance the robustness of the algorithm when noise parameters fluctuate widely within an utterance. Testing the proposed PF-based compensation (PFC) scheme on the Aurora 2 connected digit recognition task, we achieve an error reduction of 12.15% over the best multi-condition trained models, using this integrated PF-HMM framework to estimate the cluster-based HMM state sequence information. Finally, we extended the PFC framework, evaluated it on a large-vocabulary recognition task, and showed that PFC also works well for large-vocabulary systems.
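The compensation idea in the description can be illustrated with a toy bootstrap particle filter. This is only a sketch under stated assumptions, not the dissertation's algorithm: it tracks a single scalar log-energy feature, assumes the additive noise model takes the log-domain form y = log(exp(x) + exp(n)), stands in for the pooled HMM state cluster with one Gaussian, and uses a first-order autoregressive transition around the cluster mean. All function names, parameters, and numeric values (`compensate`, `likelihood`, `ar_coeff`, and so on) are hypothetical.

```python
import math
import random


def gauss_pdf(value, mean, std):
    """Density of a univariate Gaussian."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood(y, x, noise_mean, noise_std):
    """p(y | x), assuming y = log(exp(x) + exp(n)) with Gaussian n."""
    diff = math.exp(y) - math.exp(x)
    if diff <= 0.0:
        return 0.0  # clean energy cannot exceed noisy energy in this model
    n = math.log(diff)             # noise value implied by (y, x)
    jacobian = math.exp(y) / diff  # dn/dy, change of variables from n to y
    return gauss_pdf(n, noise_mean, noise_std) * jacobian


def compensate(noisy, cluster_mean, cluster_std, noise_mean, noise_std,
               num_particles=500, ar_coeff=0.9, seed=0):
    """Return one clean-feature estimate per noisy frame (toy, scalar case)."""
    rng = random.Random(seed)
    # Driving noise chosen so the AR(1) process keeps the cluster variance.
    drive_std = cluster_std * math.sqrt(1.0 - ar_coeff ** 2)
    # Assumption: initial particles are drawn from the cluster Gaussian.
    particles = [rng.gauss(cluster_mean, cluster_std)
                 for _ in range(num_particles)]
    estimates = []
    for y in noisy:
        # Predict: move each particle with the assumed AR(1) dynamics.
        particles = [cluster_mean + ar_coeff * (p - cluster_mean)
                     + rng.gauss(0.0, drive_std) for p in particles]
        # Update: weight each particle by how well it explains the frame.
        weights = [likelihood(y, p, noise_mean, noise_std) for p in particles]
        total = sum(weights)
        if total <= 0.0:
            # No particle explains the frame; pass the observation through.
            estimates.append(y)
            continue
        weights = [w / total for w in weights]
        # Estimate: posterior mean of the clean feature.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample (multinomial) to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates
```

A call such as `compensate(noisy_frames, 5.0, 1.0, 4.0, 0.3)` returns one estimate per input frame. Drawing particles from a broad cluster-level density rather than a single state density mirrors the motivation given in the description, that a cluster is more likely to cover the underlying speech variation.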
author2 Clements, Mark A.
author Mushtaq, Aleem
title An integrated approach to feature compensation combining particle filters and Hidden Markov Models for robust speech recognition
publisher Georgia Institute of Technology
publishDate 2013
url http://hdl.handle.net/1853/48982
_version_ 1718265913530122240