An integrated approach to feature compensation combining particle filters and Hidden Markov Models for robust speech recognition
The performance of automatic speech recognition systems often degrades in adverse conditions where there is a mismatch between training and testing conditions. This is true for most modern systems which employ Hidden Markov Models (HMMs) to decode speech utterances. One strategy is to map the distor...
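Main Author: | Mushtaq, Aleem |
---|---|
Other Authors: | Clements, Mark A. |
Format: | Others |
Language: | en_US |
Published: | Georgia Institute of Technology, 2013 |
Subjects: | Particle filter; Hidden Markov model; Robust speech recognition; Clustering; Markov chain Monte Carlo; Hidden Markov models; Speech perception; Monte Carlo method; Algorithms |
Online Access: | http://hdl.handle.net/1853/48982 |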
id |
ndltd-GATECH-oai-smartech.gatech.edu-1853-48982 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-GATECH-oai-smartech.gatech.edu-1853-48982; 2016-05-12T03:48:43Z; Mushtaq, Aleem; Georgia Institute of Technology; Clements, Mark A.; 2013-09-19T12:19:03Z; 2013-08; 2013-05-15; August 2013; Dissertation; application/pdf; http://hdl.handle.net/1853/48982; en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
topic |
Particle filter; Hidden Markov model; Robust speech recognition; Clustering; Markov chain Monte Carlo; Hidden Markov models; Speech perception; Monte Carlo method; Algorithms |
description |
The performance of automatic speech recognition systems often degrades in adverse conditions where there is a mismatch between training and testing conditions. This is true for most modern systems, which employ Hidden Markov Models (HMMs) to decode speech utterances. One strategy is to map the distorted features back to clean speech features that correspond well to the features used for training the HMMs. This can be achieved by treating the noisy speech as a distorted version of the clean speech of interest. Under this framework, we can track and consequently extract the underlying clean speech from the noisy signal and use this derived signal to perform utterance recognition. The particle filter is a versatile tracking technique that can be used where conventional techniques such as the Kalman filter often fall short. We propose a particle-filter-based algorithm that compensates the corrupted features according to an additive noise model, incorporating statistics from both the clean speech HMMs and the observed background noise to map noisy features back to clean speech features. Instead of using specific knowledge at the model and state levels of the HMMs, which is hard to estimate, we pool model states into clusters as side information. Since each cluster encompasses more statistics than the original HMM states, there is a higher possibility that the newly formed probability density function at the cluster level covers the underlying speech variation and generates appropriate particle filter samples for feature compensation. Additionally, a dynamic joint tracking framework that monitors the clean speech signal and the noise simultaneously is introduced to obtain good noise statistics. In this approach, the information available from clean speech tracking can be used effectively for noise estimation. The availability of dynamic noise information can enhance the robustness of the algorithm when noise parameters fluctuate widely within an utterance.
Testing the proposed PF-based compensation (PFC) scheme on the Aurora 2 connected digit recognition task, we achieve an error reduction of 12.15% over the best multi-condition trained models, using this integrated PF-HMM framework to estimate the cluster-based HMM state sequence information. Finally, we extended the PFC framework and evaluated it on a large-vocabulary recognition task, and showed that PFC also works well for large-vocabulary systems. |
author2 |
Clements, Mark A. |
author_facet |
Clements, Mark A. Mushtaq, Aleem |
author |
Mushtaq, Aleem |
author_sort |
Mushtaq, Aleem |
title |
An integrated approach to feature compensation combining particle filters and Hidden Markov Models for robust speech recognition |
publisher |
Georgia Institute of Technology |
publishDate |
2013 |
url |
http://hdl.handle.net/1853/48982 |
_version_ |
1718265913530122240 |
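
The description field outlines feature compensation by drawing samples of clean speech from a cluster-level density and weighting them against the noisy observation under an additive noise model. The following is a minimal, illustrative sketch of one such weighting step for a single log-spectral value, not the dissertation's implementation. It assumes the mismatch function y = log(exp(x) + exp(n)), a scalar Gaussian for the cluster density and for the noise, and the prior used as the proposal; the function names and all parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(value, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (value - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def compensate_frame(y, cluster_mean, cluster_var, noise_mean, noise_var,
                     num_particles=500, rng=None):
    """Estimate a clean log-spectral value x from a noisy value y.

    Assumed model (an illustration, not taken from the record):
    y = log(exp(x) + exp(n)), n ~ N(noise_mean, noise_var),
    x ~ N(cluster_mean, cluster_var) at the cluster level.
    """
    rng = rng or random.Random()
    particles, weights = [], []
    for _ in range(num_particles):
        # Draw a clean-speech particle from the cluster-level density.
        x = rng.gauss(cluster_mean, math.sqrt(cluster_var))
        ratio = math.exp(x - y)
        if ratio >= 1.0:
            # Under the additive model y > x always, so this particle is impossible.
            continue
        # Noise value implied by (x, y): n = log(exp(y) - exp(x)).
        n = y + math.log1p(-ratio)
        # Change of variables from n to y: |dn/dy| = 1 / (1 - exp(x - y)).
        jacobian = 1.0 / (1.0 - ratio)
        particles.append(x)
        weights.append(gaussian_pdf(n, noise_mean, noise_var) * jacobian)
    total = sum(weights)
    if total == 0.0:
        # No usable particle: fall back to the uncompensated value.
        return y
    # Weighted mean of the particles (an MMSE-style point estimate).
    return sum(w * x for w, x in zip(weights, particles)) / total


if __name__ == "__main__":
    estimate = compensate_frame(y=2.0, cluster_mean=1.0, cluster_var=1.0,
                                noise_mean=1.0, noise_var=0.25,
                                rng=random.Random(0))
    print(f"noisy value 2.0 -> compensated estimate {estimate:.3f}")
```

A full tracker would repeat this step frame by frame, choosing the cluster from a decoded HMM state sequence and updating the noise statistics over time; both are left out here to keep the sketch short.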