A Study of Low Bit Rate Speech Codec with Speaker Recognizability

Master's === National Cheng Kung University === Institute of Computer Science and Information Engineering === 85 ===   In the past, low bit rate speech coders were aimed mostly at intelligibility and quality, and the approaches taken can reduce speaker recognizability. In this paper, we present a low bit rate speech coder with better speaker recognizability using...

Full description

Bibliographic Details
Main Authors: Tsai, Jia-Ching, 蔡佳青
Other Authors: Wu, Chung-Hsien
Format: Others
Language: zh-TW
Published: 1997
Online Access: http://ndltd.ncl.edu.tw/handle/39322223777486191523
id ndltd-TW-085NCKU3392016
record_format oai_dc
spelling ndltd-TW-085NCKU3392016 2015-10-13T12:18:06Z http://ndltd.ncl.edu.tw/handle/39322223777486191523 A Study of Low Bit Rate Speech Codec with Speaker Recognizability 具語者特徵之低位元率語音編碼器之研究 Tsai, Jia-Ching 蔡佳青 Master's National Cheng Kung University Institute of Computer Science and Information Engineering 85 Wu, Chung-Hsien 吳宗憲 1997 thesis 40 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Institute of Computer Science and Information Engineering === 85 ===   In the past, low bit rate speech coders were aimed mostly at intelligibility and quality, and the approaches taken can reduce speaker recognizability. In this paper, we present a low bit rate speech coder with better speaker recognizability, achieved by selecting a glottal excitation pattern for a specific speaker.   The parameters affecting speaker recognizability are pitch, linear prediction coefficients, and glottal excitation. Most low bit rate speech coders have focused on finding a good and compact representation of the glottal excitation. To represent the glottal excitation suitably, the excitation pulse determination algorithm used in Multi-Pulse Excited LPC is adopted. In this paper, 25 periodic pulses are determined for a voiced frame. One period of a speaker-specific excitation pattern with only 3 pulses, one primary and two secondary, is chosen from the 25 pulses using a proposed pattern selection method. This 3-pulse pattern represents the excitation of the voiced speech produced by the speaker and is sent to the receiver. At the receiver, the 3-pulse pattern is smoothed with an FIR low-pass filter to obtain a smoother, more continuous pattern. For voiced speech, the smoothed pattern is used to synthesize the speech signal via the LPC model. For unvoiced speech, random white noise is used as the excitation.   The proposed approach has been implemented on a Pentium 133 PC running Windows 95 and runs in real time. The coder achieves a MOS of 2.5, whereas LPC-10e achieves only 2.24. Speaker recognizability with this coder is also much higher than with traditional coders.
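The receiver side described in the abstract (a 3-pulse pattern repeated at the pitch period, FIR low-pass smoothing, LPC synthesis, and white noise for unvoiced frames) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the filter taps, the pulse offsets and amplitudes, the frame length, and the sign convention of the LPC coefficients are all assumptions chosen for illustration, and the pattern selection method itself is not reproduced because the record does not describe it.

```python
import random

def build_excitation(pattern, pitch_period, frame_len):
    """Repeat a sparse pulse pattern once per pitch period.

    pattern: list of (offset, amplitude) pairs within one period (hypothetical layout).
    """
    excitation = [0.0] * frame_len
    for start in range(0, frame_len, pitch_period):
        for offset, amp in pattern:
            idx = start + offset
            if idx < frame_len:
                excitation[idx] += amp
    return excitation

def fir_filter(signal, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * x[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

def lpc_synthesize(excitation, lpc):
    """All-pole synthesis, assuming s[n] = e[n] + sum_k lpc[k-1] * s[n - k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out

def decode_frame(voiced, lpc, frame_len, pattern=None, pitch_period=None,
                 taps=(0.25, 0.5, 0.25), rng=None):
    """Decode one frame: smoothed pulse pattern if voiced, white noise otherwise."""
    if voiced:
        pulses = build_excitation(pattern, pitch_period, frame_len)
        excitation = fir_filter(pulses, taps)  # assumed 3-tap low-pass smoother
    else:
        rng = rng or random.Random(0)
        excitation = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    return lpc_synthesize(excitation, lpc)

# Illustrative call with made-up parameters: one primary and two secondary pulses.
frame = decode_frame(True, [0.9, -0.2], 160,
                     pattern=[(0, 1.0), (3, 0.4), (7, -0.3)], pitch_period=40)
```

A real decoder would also carry filter and synthesis state across frame boundaries and apply the transmitted gain; the sketch resets state every frame to stay short.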
author2 Wu, Chung-Hsien
author_facet Wu, Chung-Hsien
Tsai, Jia-Ching
蔡佳青
author Tsai, Jia-Ching
蔡佳青
spellingShingle Tsai, Jia-Ching
蔡佳青
A Study of Low Bit Rate Speech Codec with Speaker Recognizability
author_sort Tsai, Jia-Ching
title A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_short A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_full A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_fullStr A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_full_unstemmed A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_sort study of low bit rate speech codec with speaker recognizability
publishDate 1997
url http://ndltd.ncl.edu.tw/handle/39322223777486191523
work_keys_str_mv AT tsaijiaching astudyoflowbitratespeechcodecwithspeakerrecognizability
AT càijiāqīng astudyoflowbitratespeechcodecwithspeakerrecognizability
AT tsaijiaching jùyǔzhětèzhēngzhīdīwèiyuánlǜyǔyīnbiānmǎqìzhīyánjiū
AT càijiāqīng jùyǔzhětèzhēngzhīdīwèiyuánlǜyǔyīnbiānmǎqìzhīyánjiū
AT tsaijiaching studyoflowbitratespeechcodecwithspeakerrecognizability
AT càijiāqīng studyoflowbitratespeechcodecwithspeakerrecognizability
_version_ 1716857716731805696