A Study of Low Bit Rate Speech Codec with Speaker Recognizability

Master's === National Cheng Kung University === Institute of Computer Science and Information Engineering === 85 === In the past, low bit rate speech coders were mostly aimed at intelligibility and quality. The approaches used in those works may reduce speaker recognizability. In this paper, we present a low bit rate speech coder with better speaker recognizability using...


Bibliographic Details
Main Authors:	Tsai, Jia-Ching, 蔡佳青
Other Authors:	Wu, Chung-Hsien
Format:	Others
Language:	zh-TW
Published:	1997
Online Access:	http://ndltd.ncl.edu.tw/handle/39322223777486191523

id	ndltd-TW-085NCKU3392016
record_format	oai_dc
spelling	ndltd-TW-085NCKU3392016 2015-10-13T12:18:06Z http://ndltd.ncl.edu.tw/handle/39322223777486191523 A Study of Low Bit Rate Speech Codec with Speaker Recognizability 具語者特徵之低位元率語音編碼器之研究 Tsai, Jia-Ching 蔡佳青 Master's National Cheng Kung University Institute of Computer Science and Information Engineering 85 In the past, low bit rate speech coders were mostly aimed at intelligibility and quality. The approaches used in those works may reduce speaker recognizability. In this paper, we present a low bit rate speech coder with better speaker recognizability using the selection of glottal excitation for a specific speaker. The parameters affecting speaker recognizability are pitch, linear prediction coefficients, and glottal excitation. Most low bit rate speech coders have focused on finding a good and compact representation of glottal excitation. To represent the glottal excitation suitably, the excitation pulse determination algorithm used in Multi-Pulse Excited LPC is adopted. In this paper, 25 periodic pulses are determined for a voiced frame. One period of a speaker-specific excitation pattern with only 3 pulses, one primary and two secondary, is chosen from the 25 pulses using a proposed pattern selection method. This 3-pulse pattern is used to represent the excitation of the voiced speech pronounced by the speaker and is sent to the receiver. In the receiver, the 3-pulse pattern is smoothed using an FIR low-pass filter to obtain a smoother, more continuous pattern. For voiced speech, this smoothed pattern is used to synthesize speech signals via the LPC model. For unvoiced speech, random white noise is adopted as the excitation pattern. The proposed approach has been implemented on a Pentium/133 PC under Windows 95 and runs in real time. The coder achieves a mean opinion score (MOS) of 2.5, while LPC-10e achieves only 2.24. Speaker recognizability with this coder is also much higher than with traditional coders. Wu, Chung-Hsien 吳宗憲 1997 thesis 40 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Cheng Kung University === Institute of Computer Science and Information Engineering === 85 === In the past, low bit rate speech coders were mostly aimed at intelligibility and quality. The approaches used in those works may reduce speaker recognizability. In this paper, we present a low bit rate speech coder with better speaker recognizability using the selection of glottal excitation for a specific speaker. The parameters affecting speaker recognizability are pitch, linear prediction coefficients, and glottal excitation. Most low bit rate speech coders have focused on finding a good and compact representation of glottal excitation. To represent the glottal excitation suitably, the excitation pulse determination algorithm used in Multi-Pulse Excited LPC is adopted. In this paper, 25 periodic pulses are determined for a voiced frame. One period of a speaker-specific excitation pattern with only 3 pulses, one primary and two secondary, is chosen from the 25 pulses using a proposed pattern selection method. This 3-pulse pattern is used to represent the excitation of the voiced speech pronounced by the speaker and is sent to the receiver. In the receiver, the 3-pulse pattern is smoothed using an FIR low-pass filter to obtain a smoother, more continuous pattern. For voiced speech, this smoothed pattern is used to synthesize speech signals via the LPC model. For unvoiced speech, random white noise is adopted as the excitation pattern. The proposed approach has been implemented on a Pentium/133 PC under Windows 95 and runs in real time. The coder achieves a mean opinion score (MOS) of 2.5, while LPC-10e achieves only 2.24. Speaker recognizability with this coder is also much higher than with traditional coders.
author2	Wu, Chung-Hsien
author_facet	Wu, Chung-Hsien Tsai, Jia-Ching 蔡佳青
author	Tsai, Jia-Ching 蔡佳青
spellingShingle	Tsai, Jia-Ching 蔡佳青 A Study of Low Bit Rate Speech Codec with Speaker Recognizability
author_sort	Tsai, Jia-Ching
title	A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_short	A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_full	A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_fullStr	A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_full_unstemmed	A Study of Low Bit Rate Speech Codec with Speaker Recognizability
title_sort	study of low bit rate speech codec with speaker recognizability
publishDate	1997
url	http://ndltd.ncl.edu.tw/handle/39322223777486191523
work_keys_str_mv	AT tsaijiaching astudyoflowbitratespeechcodecwithspeakerrecognizability AT càijiāqīng astudyoflowbitratespeechcodecwithspeakerrecognizability AT tsaijiaching jùyǔzhětèzhēngzhīdīwèiyuánlǜyǔyīnbiānmǎqìzhīyánjiū AT càijiāqīng jùyǔzhětèzhēngzhīdīwèiyuánlǜyǔyīnbiānmǎqìzhīyánjiū AT tsaijiaching studyoflowbitratespeechcodecwithspeakerrecognizability AT càijiāqīng studyoflowbitratespeechcodecwithspeakerrecognizability
_version_	1716857716731805696
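
The receiver side described in the abstract (a 3-pulse pattern smoothed by an FIR low-pass filter, then LPC synthesis, with white noise for unvoiced frames) might be sketched roughly as follows. This is an illustrative sketch only, not the thesis implementation: the function names, the 3-tap filter (0.25, 0.5, 0.25), the circular smoothing over one pitch period, the Gaussian noise source, and the sign convention s[n] = e[n] + sum_k a_k s[n-k] are all assumptions, since the record gives none of these details.

```python
import random

def three_pulse_period(period, pulses):
    """One pitch period of excitation: zero except at the (position, amplitude) pulses."""
    excitation = [0.0] * period
    for position, amplitude in pulses:
        excitation[position % period] += amplitude
    return excitation

def smooth_pattern(pattern, taps):
    """FIR smoothing of one period, applied circularly (assumption) so that
    the smoothed pattern can be repeated without a seam at period boundaries."""
    n_samples = len(pattern)
    return [
        sum(h * pattern[(n - k) % n_samples] for k, h in enumerate(taps))
        for n in range(n_samples)
    ]

def lpc_synthesize(excitation, lpc):
    """All-pole LPC synthesis: s[n] = e[n] + sum_k lpc[k] * s[n-1-k]."""
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * speech[n - 1 - k]
        speech.append(acc)
    return speech

def decode_frame(voiced, frame_len, lpc, period=0, pulses=(),
                 taps=(0.25, 0.5, 0.25), rng=None):
    """Rebuild one frame: smoothed periodic pulses if voiced, white noise otherwise."""
    if voiced:
        one_period = smooth_pattern(three_pulse_period(period, pulses), taps)
        excitation = [one_period[n % period] for n in range(frame_len)]
    else:
        rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
        excitation = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    return lpc_synthesize(excitation, lpc)
```

For example, decode_frame(True, 180, lpc, period=60, pulses=[(0, 1.0), (20, 0.4), (35, -0.3)]) would synthesize a 180-sample voiced frame from one primary and two secondary pulses; the frame length, pitch period, and pulse values here are made up for illustration.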
