A STUDY OF SPEAKER RECOGNITION
Ph.D. === National Tsing Hua University === Institute of Electrical Engineering === 84 === This dissertation investigates three techniques in speaker recognition: the discriminative training algorithm, the scoring function, and the acoustic-segment-based probabilistic model. Some new methods are derived...
Main Authors: | Liu, Chi-Shi, 劉繼謚 |
---|---|
Other Authors: | Wang, Hsiao-Chuan, 王小川 |
Format: | Others |
Language: | zh-TW |
Published: | 1996 |
Online Access: | http://ndltd.ncl.edu.tw/handle/79630296959684505857 |
id |
ndltd-TW-084NTHU0442003 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-084NTHU0442003 2016-07-13T04:10:35Z http://ndltd.ncl.edu.tw/handle/79630296959684505857 A STUDY OF SPEAKER RECOGNITION 語者識別上之研究 Liu, Chi-Shi 劉繼謚 Ph.D. National Tsing Hua University Institute of Electrical Engineering 84 This dissertation investigates three techniques in speaker recognition: the discriminative training algorithm, the scoring function, and the acoustic-segment-based probabilistic model. Some new methods are derived for improving the accuracy rate. In discriminative training, we construct a hidden Markov model for each speaker while taking into account the models of the other competing speakers, so that speaker separation is enhanced. The optimal solution can be obtained by using a probabilistic descent algorithm. To improve on the performance of the conventional scoring algorithm, we propose a new scoring method for speaker verification. This method is derived from the Bayes test for minimum risk, with the objective of minimizing the error probability. Conventional frame-based probabilistic models use the instantaneous spectral information of each individual frame. Instead, we propose a new segment model that uses both instantaneous and previous spectral information. The fixed-length segment model is a simple way to improve performance. A more precise method is to consider acoustic segments. In this method, speech signals are represented by orthogonal polynomial functions. An iterative algorithm is proposed to segment and model the speech signals. The segment boundaries are detected automatically according to the characteristics of the speech signals, so that segment-based probabilistic models can be generated. A 100-speaker digit database collected through the public switched telephone network is used in a series of experiments to show the effectiveness of the proposed methods. Wang, Hsiao-Chuan 王小川 1996 學位論文 ; thesis 115 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Ph.D. === National Tsing Hua University === Institute of Electrical Engineering === 84 === This dissertation investigates three
techniques in speaker recognition: the discriminative training
algorithm, the scoring function, and the acoustic-segment-based
probabilistic model. Some new methods are derived for improving
the accuracy rate. In discriminative training, we construct a
hidden Markov model for each speaker while taking into account
the models of the other competing speakers, so that speaker
separation is enhanced. The optimal solution can be obtained by
using a probabilistic descent algorithm. To improve on the
performance of the conventional scoring algorithm, we propose a
new scoring method for speaker verification. This method is
derived from the Bayes test for minimum risk, with the objective
of minimizing the error probability. Conventional frame-based
probabilistic models use the instantaneous spectral information
of each individual frame. Instead, we propose a new segment
model that uses both instantaneous and previous spectral
information. The fixed-length segment model is a simple way to
improve performance. A more precise method is to consider
acoustic segments. In this method, speech signals are
represented by orthogonal polynomial functions. An iterative
algorithm is proposed to segment and model the speech signals.
The segment boundaries are detected automatically according to
the characteristics of the speech signals, so that
segment-based probabilistic models can be generated. A
100-speaker digit database collected through the public
switched telephone network is used in a series of experiments
to show the effectiveness of the proposed methods.
|
author2 |
Wang, Hsiao-Chuan |
author_facet |
Wang, Hsiao-Chuan Liu, Chi-Shi 劉繼謚 |
author |
Liu, Chi-Shi 劉繼謚 |
spellingShingle |
Liu, Chi-Shi 劉繼謚 A STUDY OF SPEAKER RECOGNITION |
author_sort |
Liu, Chi-Shi |
title |
A STUDY OF SPEAKER RECOGNITION |
title_short |
A STUDY OF SPEAKER RECOGNITION |
title_full |
A STUDY OF SPEAKER RECOGNITION |
title_fullStr |
A STUDY OF SPEAKER RECOGNITION |
title_full_unstemmed |
A STUDY OF SPEAKER RECOGNITION |
title_sort |
study of speaker recognition |
publishDate |
1996 |
url |
http://ndltd.ncl.edu.tw/handle/79630296959684505857 |
work_keys_str_mv |
AT liuchishi astudyofspeakerrecognition AT liújìshì astudyofspeakerrecognition AT liuchishi yǔzhěshíbiéshàngzhīyánjiū AT liújìshì yǔzhěshíbiéshàngzhīyánjiū AT liuchishi studyofspeakerrecognition AT liújìshì studyofspeakerrecognition |
_version_ |
1718345046324936704 |