A VARIABLE BIT RATE MELP SPEECH CODER
Master's === Tatung University === Graduate Institute of Communication Engineering === 90 === In the era of third-generation (3G) wireless personal communications, although multimedia applications such as video and data communication have become increasingly popular, speech communication remains one of the most important mobile radio services....
Main Authors: | Kai-Cheng Lien, 廉凱成 |
---|---|
Other Authors: | Ching-Kuen Lee |
Format: | Others |
Language: | en_US |
Published: | 2002 |
Online Access: | http://ndltd.ncl.edu.tw/handle/48041242170593651845 |
id |
ndltd-TW-090TTU00650012 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-090TTU00650012 2016-06-24T04:15:11Z http://ndltd.ncl.edu.tw/handle/48041242170593651845 A VARIABLE BIT RATE MELP SPEECH CODER 可變位元率MELP語音編碼器 Kai-Cheng Lien 廉凱成 Master's Tatung University Graduate Institute of Communication Engineering 90 Ching-Kuen Lee 李清坤 2002 thesis 66 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === Tatung University === Graduate Institute of Communication Engineering === 90 === In the era of third-generation (3G) wireless personal communications, although multimedia applications such as video and data communication have become increasingly popular, speech communication remains one of the most important mobile radio services. Consequently, speech coding techniques that compress speech information into as few parameters as possible are increasingly important. Variable bit rate speech coding is an attractive way to retain high overall voice quality at a low average bit rate. The aim of this thesis is thus to develop a variable-rate speech coder based on Federal Standard 1017 (FS-1017), a 2.4 kbps mixed excitation linear prediction (MELP) coder originally developed by Texas Instruments and then standardized by the U.S. Department of Defense.
In the FS-1017 standard 2.4 kbps MELP coder, the sampling rate is 8 kHz with 16-bit resolution and the frame size is 22.5 ms. Each frame is coded with 54 bits, of which 25 bits represent the LPC coefficients and account for 46% of the bit budget. Efficient quantization of the LPC coefficients is therefore essential to reducing the overall bit rate of the MELP coder.
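The frame figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function names are illustrative and do not come from the thesis.

```python
def samples_per_frame(sample_rate_hz: float, frame_ms: float) -> int:
    """Number of speech samples covered by one frame."""
    return round(sample_rate_hz * frame_ms / 1000.0)


def bit_rate_bps(bits_per_frame: float, frame_ms: float) -> float:
    """Bit rate implied by a per-frame bit budget."""
    return bits_per_frame * 1000.0 / frame_ms


def bits_per_frame(rate_bps: float, frame_ms: float) -> float:
    """Average per-frame bit budget implied by an average bit rate."""
    return rate_bps * frame_ms / 1000.0


if __name__ == "__main__":
    print(samples_per_frame(8000, 22.5))   # 180 samples per frame
    print(bit_rate_bps(54, 22.5))          # 2400.0 bps
    print(f"{25 / 54:.1%}")                # 46.3% of each frame is LSF data
    print(bits_per_frame(2100, 22.5))      # 47.25 bits per frame on average
```

Read this way, an average rate of about 2.1 kbps corresponds to roughly 47 bits per frame, that is, about 7 of the 54 bits saved per frame on average.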
In the FS-1017 MELP coder, the LPC parameters are transformed to line spectral frequencies (LSFs) and then quantized by a fixed four-stage vector quantizer (VQ). However, our experimental results showed that, with only one to three VQ stages, more than 20% of the quantized LSFs already satisfied the requirement of “transparent quantization,” i.e., an average spectral distortion (SD) of less than 1 dB. Accordingly, we proposed a variable-stage vector quantizer (VSVQ) as the basis of a variable-rate MELP speech coder. Specifically, we insert an experimentally determined threshold after each stage of the VSVQ to determine whether the SD requirement is satisfied. If it is, the VSVQ procedure stops, saving the bits that the remaining VQ stages would have used.
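The early-stopping idea can be sketched as a multistage quantizer that checks a per-stage threshold after each stage. The code below is only an illustration under stated assumptions, not the thesis implementation: it uses a greedy nearest-neighbour search, the names `vsvq_encode` and `euclidean` are hypothetical, and plain Euclidean distance stands in for spectral distortion, which would have to be computed from the LPC spectra in a real coder. The abstract also does not say how the number of stages used is signalled to the decoder, so that step is omitted.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; a stand-in for spectral distortion (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def vsvq_encode(
    target: Sequence[float],
    codebooks: Sequence[Sequence[Sequence[float]]],
    thresholds: Sequence[float],
    distortion: Callable[[Sequence[float], Sequence[float]], float] = euclidean,
) -> Tuple[List[int], Vector]:
    """Quantize `target` stage by stage, stopping once the distortion is small.

    Returns the codeword index chosen at each stage that was actually used,
    together with the reconstructed vector.
    """
    recon: Vector = [0.0] * len(target)
    indices: List[int] = []
    for codebook, threshold in zip(codebooks, thresholds):
        # Each stage quantizes what the previous stages left over.
        residual = [t - r for t, r in zip(target, recon)]
        best = min(
            range(len(codebook)),
            key=lambda i: euclidean(residual, codebook[i]),
        )
        indices.append(best)
        recon = [r + c for r, c in zip(recon, codebook[best])]
        # Stop early when the requirement is met; later stages cost no bits.
        if distortion(target, recon) <= threshold:
            break
    return indices, recon


if __name__ == "__main__":
    # Toy two-stage codebooks, purely for demonstration.
    toy_codebooks = [
        [[0.0, 0.0], [1.0, 2.0]],
        [[0.1, 0.1], [-0.1, -0.1]],
    ]
    print(vsvq_encode([1.0, 2.0], toy_codebooks, [0.1, 0.1])[0])  # one stage used
    print(vsvq_encode([1.2, 2.1], toy_codebooks, [0.1, 0.1])[0])  # two stages used
```

In this sketch the bit saving for a frame is simply the sum of the index widths of the stages that were skipped.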
Our experimental results showed that the speech quality of the proposed variable-rate MELP coder is very close to that of the FS-1017 standard 2.4 kbps MELP coder. At an average bit rate of around 2.1 kbps, there is no audible difference between the FS-1017 and the proposed variable-rate MELP coders. The experimental results also showed that the Signal-to-Difference Ratio (SDR) between the synthetic speech of the FS-1017 coder and that of the proposed variable-rate MELP coder is as high as 85 dB. The variable-stage vector quantizer structure proposed in this research has thus proved successful in the MELP coder. We believe that it also has high potential for use in other types of speech coders, extending their range of application.
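The abstract does not give the formula used for the SDR. A common reading, assumed here, is the energy of the reference signal divided by the energy of the sample-by-sample difference, expressed in dB. The sketch below follows that assumed definition; the function name `sdr_db` is illustrative.

```python
import math
from typing import Sequence


def sdr_db(reference: Sequence[float], test: Sequence[float]) -> float:
    """Signal-to-difference ratio in dB (assumed definition, see lead-in)."""
    if len(reference) != len(test):
        raise ValueError("signals must have the same length")
    signal_energy = sum(x * x for x in reference)
    diff_energy = sum((x - y) ** 2 for x, y in zip(reference, test))
    if diff_energy == 0.0:
        return math.inf       # identical signals
    if signal_energy == 0.0:
        return -math.inf      # silent reference, non-zero difference
    return 10.0 * math.log10(signal_energy / diff_energy)


if __name__ == "__main__":
    # Reference energy 100, difference energy 1, so the ratio is 20 dB.
    print(sdr_db([10.0, 0.0], [9.0, 0.0]))
```

Under this definition, a higher value means the two synthetic speech signals are closer to each other sample by sample.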
|
author2 |
Ching-Kuen Lee |
author_facet |
Ching-Kuen Lee Kai-Cheng Lien 廉凱成 |
author |
Kai-Cheng Lien 廉凱成 |
spellingShingle |
Kai-Cheng Lien 廉凱成 A VARIABLE BIT RATE MELP SPEECH CODER |
author_sort |
Kai-Cheng Lien |
title |
A VARIABLE BIT RATE MELP SPEECH CODER |
title_short |
A VARIABLE BIT RATE MELP SPEECH CODER |
title_full |
A VARIABLE BIT RATE MELP SPEECH CODER |
title_fullStr |
A VARIABLE BIT RATE MELP SPEECH CODER |
title_full_unstemmed |
A VARIABLE BIT RATE MELP SPEECH CODER |
title_sort |
variable bit rate melp speech coder |
publishDate |
2002 |
url |
http://ndltd.ncl.edu.tw/handle/48041242170593651845 |
work_keys_str_mv |
AT kaichenglien avariablebitratemelpspeechcoder AT liánkǎichéng avariablebitratemelpspeechcoder AT kaichenglien kěbiànwèiyuánlǜmelpyǔyīnbiānmǎqì AT liánkǎichéng kěbiànwèiyuánlǜmelpyǔyīnbiānmǎqì AT kaichenglien variablebitratemelpspeechcoder AT liánkǎichéng variablebitratemelpspeechcoder |
_version_ |
1718321711416344576 |