Automatic Identification and Indexing of Chinese Multilingual Spoken Messages
Doctoral dissertation === National Chiao Tung University (國立交通大學) === Department of Communication Engineering (電信工程系) === 89 === This study focuses on two issues: dialect identification and spoken message indexing, which are necessary steps to design spoken language systems with the goal of multilingual information access. The first part of this st...
Main Authors: | Wei-Ho Tsai, 蔡偉和 |
---|---|
Other Authors: | Wen-Whei Chang |
Format: | Others |
Language: | en_US |
Published: | 2001 |
Online Access: | http://ndltd.ncl.edu.tw/handle/40583607492665665191 |
id |
ndltd-TW-089NCTU0435095 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-089NCTU0435095 2016-01-29T04:28:15Z http://ndltd.ncl.edu.tw/handle/40583607492665665191 Automatic Identification and Indexing of Chinese Multilingual Spoken Messages 語言辨識與檢索在中文口語處理之研究 Wei-Ho Tsai 蔡偉和 博士 國立交通大學 電信工程系 89 Wen-Whei Chang Sin-Horng Chen 張文輝 陳信宏 2001 學位論文 ; thesis 111 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Doctoral dissertation === National Chiao Tung University (國立交通大學) === Department of Communication Engineering (電信工程系) === 89 === This study focuses on two issues, dialect identification and spoken message indexing, which are necessary steps in designing spoken language systems for multilingual information access.
The first part of this study presents three approaches that employ varying degrees of linguistic information in order to evaluate their relative contributions to Chinese dialect identification. The first approach is based on phonotactic analysis following phonetic tokenization, the second on pitch contour dynamics, and the third on a combination of segmental and prosodic features. Prosodic information is important because Chinese syllables may have the same phonetic composition but different lexical meanings when spoken with different tones. Simulation results indicate that the proposed composite hidden Markov model is very effective at integrating this information and can discriminate among three major Chinese dialects spoken in Taiwan with 89.3% accuracy. Also proposed is a new stochastic model, the Gaussian mixture bigram model (GMBM), that better characterizes the time correlation among acoustic feature frames. The main attraction of GMBMs is that the observations used in dialect-specific modeling are extracted directly from the acoustic features, so the model parameters can be estimated without any transcription of the training utterances. For greater efficiency, a minimum classification error algorithm is employed for discriminative training of a GMBM-based dialect identification system.
The second part of this study addresses the general task of automatically indexing spoken messages when no information is available about the language. This task is accomplished by partitioning the unlabeled speech messages into segments containing only one language and by grouping acoustically homogeneous segments into one-language clusters. Approaches to language-based segmentation are presented that combine GMBM modeling of language acoustics with different dissimilarity measures. For language clustering, the merits of a new scheme based on vector clustering are explored in comparison with conventional hierarchical clustering techniques.
|
author2 |
Wen-Whei Chang |
author_facet |
Wen-Whei Chang Wei-Ho Tsai 蔡偉和 |
author |
Wei-Ho Tsai 蔡偉和 |
spellingShingle |
Wei-Ho Tsai 蔡偉和 Automatic Identification and Indexing of Chinese Multilingual Spoken Messages |
author_sort |
Wei-Ho Tsai |
title |
Automatic Identification and Indexing of Chinese Multilingual Spoken Messages |
title_short |
Automatic Identification and Indexing of Chinese Multilingual Spoken Messages |
title_full |
Automatic Identification and Indexing of Chinese Multilingual Spoken Messages |
title_fullStr |
Automatic Identification and Indexing of Chinese Multilingual Spoken Messages |
title_full_unstemmed |
Automatic Identification and Indexing of Chinese Multilingual Spoken Messages |
title_sort |
automatic identification and indexing of chinese multilingual spoken messages |
publishDate |
2001 |
url |
http://ndltd.ncl.edu.tw/handle/40583607492665665191 |
work_keys_str_mv |
AT weihotsai automaticidentificationandindexingofchinesemultilingualspokenmessages AT càiwěihé automaticidentificationandindexingofchinesemultilingualspokenmessages AT weihotsai yǔyánbiànshíyǔjiǎnsuǒzàizhōngwénkǒuyǔchùlǐzhīyánjiū AT càiwěihé yǔyánbiànshíyǔjiǎnsuǒzàizhōngwénkǒuyǔchùlǐzhīyánjiū |
_version_ |
1718171027270270976 |
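
The abstract's first approach, phonotactic analysis following phonetic tokenization, can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes each utterance has already been converted to a sequence of phone-like tokens, trains one add-alpha smoothed bigram model per dialect, and picks the dialect whose model gives the highest log-likelihood. The function names, the token symbols, and the labels `dialect_x` and `dialect_y` are all hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab, alpha=1.0):
    """Return add-alpha smoothed bigram log-probabilities log P(cur | prev)."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    size = len(vocab)
    return {
        (prev, cur): math.log(
            (pair_counts[(prev, cur)] + alpha)
            / (context_counts[prev] + alpha * size)
        )
        for prev in vocab
        for cur in vocab
    }


def score(seq, model):
    """Sum the bigram log-probabilities over one token sequence."""
    return sum(model[(prev, cur)] for prev, cur in zip(seq, seq[1:]))


def identify(seq, models):
    """Pick the label whose bigram model scores the sequence highest."""
    return max(models, key=lambda label: score(seq, models[label]))


# Hypothetical toy data: tokens stand in for a phonetic tokenizer's output.
vocab = ["a", "b", "c"]
training = {
    "dialect_x": [["a", "b", "a", "b", "c"], ["a", "b", "a", "b"]],
    "dialect_y": [["b", "c", "b", "c", "a"], ["c", "b", "c", "b"]],
}
models = {label: train_bigram(seqs, vocab) for label, seqs in training.items()}

print(identify(["a", "b", "a", "b"], models))
```

The smoothing constant `alpha` keeps unseen token pairs from receiving zero probability, which matters when the per-dialect training data is small.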