Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition

We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to cont...

Bibliographic Details
Main Authors:	Akinori Ito, Yasutomo Kajiura, Motoyuki Suzuki, Shozo Makino
Format:	Article
Language:	English
Published:	SpringerOpen 2009-01-01
Series:	EURASIP Journal on Audio, Speech, and Music Processing
Online Access:	http://dx.doi.org/10.1155/2009/140575

id	doaj-640035cbb4d145dd94a0900a8b1913c6
record_format	Article
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Akinori Ito Yasutomo Kajiura Motoyuki Suzuki Shozo Makino
author_sort	Akinori Ito
title	Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition
publisher	SpringerOpen
series	EURASIP Journal on Audio, Speech, and Music Processing
issn	1687-4714 1687-4722
publishDate	2009-01-01
description	We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves Web documents using the chosen keywords. A problem is that the selected keywords tend to contain misrecognized words. The proposed method introduces two new ideas for avoiding the effects of keywords derived from misrecognized words. The first idea is to compose multiple queries from the selected keyword candidates so that misrecognized words and correct words do not fall into the same query. The second idea is to determine the number of Web documents downloaded for each query according to the “query relevance.” Combining these two ideas, we can alleviate the adverse effect of misrecognized keywords by decreasing the number of Web documents downloaded from queries that contain misrecognized keywords. Finally, we examine a method of determining the number of iterative adaptations based on the recognition likelihood. Experiments have shown that the proposed stopping criterion can determine a nearly optimal number of iterations. In the final experiment, the word accuracy without adaptation (55.29%) was improved to 60.38%, which was 1.13 points better than the result of the conventional unsupervised adaptation method (59.25%).
url	http://dx.doi.org/10.1155/2009/140575

Similar Items