A STUDY OF SPEAKER RECOGNITION

Doctoral === National Tsing Hua University === Institute of Electrical Engineering === 84 === This dissertation investigates three techniques in speaker recognition: the discriminative training algorithm, the scoring function, and the acoustic-segment-based probabilistic model. Some new methods are derived...


Bibliographic Details
Main Authors: Liu, Chi-Shi, 劉繼謚
Other Authors: Wang, Hsiao-Chuan
Format: Others
Language: zh-TW
Published: 1996
Online Access: http://ndltd.ncl.edu.tw/handle/79630296959684505857
id ndltd-TW-084NTHU0442003
record_format oai_dc
spelling ndltd-TW-084NTHU0442003 2016-07-13T04:10:35Z http://ndltd.ncl.edu.tw/handle/79630296959684505857 A STUDY OF SPEAKER RECOGNITION 語者識別上之研究 Liu, Chi-Shi 劉繼謚 Doctoral National Tsing Hua University Institute of Electrical Engineering 84 Wang, Hsiao-Chuan 王小川 1996 thesis 115 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Doctoral === National Tsing Hua University === Institute of Electrical Engineering === 84 === This dissertation investigates three techniques in speaker recognition: the discriminative training algorithm, the scoring function, and the acoustic-segment-based probabilistic model. Some new methods are derived for improving the accuracy rate. In discriminative training, we construct a hidden Markov model for each speaker, taking into account the models of other competing speakers so that speaker separation is enhanced. The optimal solution can be obtained by using a probabilistic descent algorithm. To improve on the performance of the conventional scoring algorithm, we propose a new scoring method for speaker verification. This method is derived from the Bayes test for minimum risk, with the objective of minimizing the error probability. Conventional frame-based probabilistic models use the instantaneous spectral information of each individual frame. Instead, we propose a new segment model that uses both instantaneous and previous spectral information. The fixed-length segment model is a simple way to improve performance. A more precise method is to consider acoustic segments. In this method, speech signals are represented by orthogonal polynomial functions. An iterative algorithm is proposed to segment and model the speech signals. The segment boundaries are detected automatically according to the characteristics of the speech signals, so that segment-based probabilistic models can be generated. A 100-speaker digit database collected through the public switched telephone network is used in a series of experiments to show the effectiveness of the proposed methods.
author2 Wang, Hsiao-Chuan
author_facet Wang, Hsiao-Chuan
Liu, Chi-Shi
劉繼謚
author Liu, Chi-Shi
劉繼謚
spellingShingle Liu, Chi-Shi
劉繼謚
A STUDY OF SPEAKER RECOGNITION
author_sort Liu, Chi-Shi
title A STUDY OF SPEAKER RECOGNITION
title_sort study of speaker recognition
publishDate 1996
url http://ndltd.ncl.edu.tw/handle/79630296959684505857
_version_ 1718345046324936704
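
Illustrative note: the abstract says the proposed scoring method is derived from the Bayes test for minimum risk, but this record does not give its form. The sketch below therefore shows only the textbook Bayes likelihood-ratio test for speaker verification, not the dissertation's method. The single-Gaussian speaker and impostor models, the function names, and the default prior and costs are all assumptions made for illustration; the dissertation itself uses hidden Markov models.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian (stand-in for a real speaker model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_likelihood_ratio(frames, target, impostor):
    """Sum of per-frame log p(x | target) - log p(x | impostor).

    Frames are assumed independent; `target` and `impostor` are
    hypothetical (mean, variance) pairs.
    """
    return sum(
        gaussian_log_pdf(x, *target) - gaussian_log_pdf(x, *impostor)
        for x in frames
    )


def bayes_log_threshold(p_target, cost_miss=1.0, cost_false_accept=1.0):
    """Log threshold of the Bayes minimum-risk test.

    With equal costs the test minimizes the probability of error.
    """
    return math.log(cost_false_accept * (1.0 - p_target)) - math.log(
        cost_miss * p_target
    )


def verify(frames, target, impostor, p_target=0.5):
    """Accept the claimed identity when the ratio exceeds the threshold."""
    llr = log_likelihood_ratio(frames, target, impostor)
    return llr > bayes_log_threshold(p_target)


if __name__ == "__main__":
    claimed = (0.0, 1.0)      # assumed model of the claimed speaker
    background = (3.0, 1.0)   # assumed model of competing speakers
    print(verify([0.1, -0.2, 0.3], claimed, background))
    print(verify([2.9, 3.1, 3.0], claimed, background))
```

Raising `cost_false_accept` or lowering `p_target` raises the threshold, so the test accepts fewer claims; with equal costs and equal priors the threshold is zero and the decision reduces to picking the more likely model.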