Automatic Identification and Indexing of Chinese Multilingual Spoken Messages

PhD === National Chiao Tung University === Department of Communication Engineering === 89 === This study focuses on two issues: dialect identification and spoken message indexing, which are necessary steps to design spoken language systems with the goal of multilingual information access. The first part of this st...

Bibliographic Details
Main Authors: Wei-Ho Tsai, 蔡偉和
Other Authors: Wen-Whei Chang
Format: Others
Language: en_US
Published: 2001
Online Access: http://ndltd.ncl.edu.tw/handle/40583607492665665191
id ndltd-TW-089NCTU0435095
record_format oai_dc
spelling ndltd-TW-089NCTU0435095 2016-01-29T04:28:15Z http://ndltd.ncl.edu.tw/handle/40583607492665665191 Automatic Identification and Indexing of Chinese Multilingual Spoken Messages 語言辨識與檢索在中文口語處理之研究 Wei-Ho Tsai 蔡偉和 PhD National Chiao Tung University Department of Communication Engineering 89 Wen-Whei Chang Sin-Horng Chen 張文輝 陳信宏 2001 thesis 111 en_US
collection NDLTD
sources NDLTD
description PhD === National Chiao Tung University === Department of Communication Engineering === 89 === This study focuses on two issues, dialect identification and spoken message indexing, which are necessary steps in designing spoken language systems for multilingual information access. The first part of this study presents three approaches that employ varying degrees of linguistic traits in order to evaluate their relative contributions to Chinese dialect identification. The first approach is based on phonotactic analysis following phonetic tokenization, the second on pitch contour dynamics, and the third on a combination of segmental and prosodic features. Prosodic information is important because Chinese syllables may have the same phonetic composition but different lexical meanings when spoken with different tones. Simulation results indicate that the proposed composite hidden Markov model is very effective in information integration, and that this model can discriminate among three major Chinese dialects spoken in Taiwan with 89.3% accuracy. Also proposed is a new stochastic model, the Gaussian mixture bigram model (GMBM), that better characterizes the time correlation between acoustic feature frames. The main attraction of GMBMs is that the observations used in dialect-specific modeling are extracted directly from the acoustic features, so the model parameters can be estimated without any transcription of the training utterances. For greater efficiency, a minimum classification error algorithm is employed to accomplish discriminative training of a GMBM-based dialect identification system. The second part of this study addresses the general task of automatically indexing spoken messages when no information is available regarding the language. This task is accomplished by partitioning the unlabeled speech messages into segments containing only one language and by grouping acoustically homogeneous segments into one-language clusters. Approaches to language-based segmentation are presented based on GMBM modeling of language acoustics in conjunction with different dissimilarity measures. For language clustering, the merits of a new scheme based on vector clustering are explored in comparison with conventional hierarchical clustering techniques.
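The abstract does not give the GMBM equations, so the following is only a minimal sketch of one plausible reading, not the thesis's formulation. It assumes that each dialect is represented by a diagonal-covariance Gaussian mixture over stacked pairs of consecutive frames, that an utterance is scored by the summed conditional log-likelihood log p(x_t | x_{t-1}) = log p(x_{t-1}, x_t) - log p(x_{t-1}), and that the dialect with the highest score is chosen. The function names, the model layout, and the toy parameters are all hypothetical; training (including the minimum classification error step) is omitted.

```python
import numpy as np

def log_gmm(x, weights, means, variances):
    """Log density of a diagonal-covariance Gaussian mixture at each row of x.

    x: (T, D); weights: (K,); means, variances: (K, D).
    """
    diff = x[:, None, :] - means[None, :, :]                  # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_comp                    # (T, K)
    # Log-sum-exp over mixture components, stabilised by the row maximum.
    m = log_joint.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True)))[:, 0]

def bigram_score(frames, model):
    """Sum over t of log p(x_t | x_{t-1}) under a mixture over frame pairs.

    Assumption: `model` is (weights, means, variances) for a mixture over
    the 2D-dimensional stacked vector [x_{t-1}, x_t].
    """
    weights, means, variances = model
    d = frames.shape[1]
    pairs = np.hstack([frames[:-1], frames[1:]])              # (T-1, 2D)
    joint = log_gmm(pairs, weights, means, variances)
    # With diagonal covariances, the marginal over x_{t-1} is the mixture
    # restricted to the first D dimensions.
    marginal = log_gmm(frames[:-1], weights, means[:, :d], variances[:, :d])
    return float(np.sum(joint - marginal))

def identify(frames, models):
    """Return the name of the model with the highest bigram score."""
    return max(models, key=lambda name: bigram_score(frames, models[name]))

# Toy, hand-set models (hypothetical values; real models would be trained).
models = {
    "dialect_a": (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]])),
    "dialect_b": (np.array([1.0]), np.array([[5.0, 5.0]]), np.array([[1.0, 1.0]])),
}
frames = np.array([[0.1], [-0.2], [0.0]])   # three one-dimensional frames
print(identify(frames, models))
```

Because the score uses only acoustic frames, no transcription is needed at scoring time, which matches the property the abstract highlights. With diagonal covariances, dependence between consecutive frames enters only through the shared mixture component; a full-covariance variant would capture it within each component as well.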