A Study on Fast Coding Algorithm for ITU-T G.723.1 and G.729 Speech Codecs


Bibliographic Details
Main Authors: Jia-Yu Wang, 王嘉宇
Other Authors: Rong-San Lin
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/72260792424405854470
Description
Summary: Master's thesis (碩士) === Southern Taiwan University (南台科技大學) === Department of Computer Science and Information Engineering (資訊工程系) === 98 === Speech communication is the most common service in Internet telecommunication and multimedia processing. However, because speech must be transmitted continuously, an Internet voice service has to collect enough speech data before sending it, which can cause long speech delay and degrade speech quality when network bandwidth is limited. To achieve continuity, speech codecs with high compression rates are used to generate a low-rate data stream, but such codecs require higher computational complexity. Reducing the bit rate and improving the speech quality of the codec are therefore the most significant goals. ITU-T offers the G.723.1 and G.729 codecs, which are widely used in Internet applications and provide high-quality, low-bit-rate coding. This thesis predicts the search range of the adaptive-codebook gain in the standard G.723.1 codec by minimizing the mean square error between the three-tap excitation signal, together with its residual signal, and a one-tap pitch predictor. For the G.723.1 MP-MLQ, we propose a fast search algorithm that uses a designed energy function and the multi-track position structure of the stochastic excitation signals to predict the candidate pulses for each subframe. For the ACELP codebooks of both G.723.1 and G.729, we propose a fast search algorithm based on depth-first tree search (DFS) and pulse-position likelihood estimation. Because both encoders belong to the CELP coding structure, transcoding between them is completed through two processes, line spectral pair conversion and pitch conversion, both of which use linear interpolation. To reduce computational complexity further, we use two fast search algorithms: first, we employ residual signals to predict the candidate gain vectors of the adaptive codebook in G.723.1; next, we adopt a fast search method for the stochastic excitation pulses. Simulation results show that the proposed methods substantially reduce the amount of computation, while the reconstructed speech maintains its quality with perceptually negligible degradation.
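
The summary states that the line spectral pair and pitch conversions used in transcoding rely on linear interpolation. The following is only a minimal sketch of that general idea, not the method of the thesis: it assumes a G.723.1-to-G.729 direction in which one 30 ms G.723.1 frame is mapped onto three 10 ms G.729 frames, and the function name `interpolate_lsp`, the evenly spaced weights, and the sample values are all hypothetical.

```python
def interpolate_lsp(prev_lsp, curr_lsp, steps):
    """Linearly interpolate between two LSP vectors.

    Returns `steps` vectors moving from prev_lsp toward curr_lsp; the last
    one equals curr_lsp. The evenly spaced weights are an assumption made
    for illustration, not weights taken from the thesis or the standards.
    """
    if len(prev_lsp) != len(curr_lsp):
        raise ValueError("LSP vectors must have the same order")
    if steps < 1:
        raise ValueError("steps must be at least 1")

    frames = []
    for k in range(1, steps + 1):
        w = k / steps  # weight given to the current frame
        frames.append([(1.0 - w) * p + w * c
                       for p, c in zip(prev_lsp, curr_lsp)])
    return frames


# Hypothetical 2-coefficient vectors (real codecs use 10th-order LSPs).
previous_frame = [0.0, 1.0]
current_frame = [3.0, 4.0]

# One 30 ms source frame spread over three 10 ms target frames.
for lsp in interpolate_lsp(previous_frame, current_frame, steps=3):
    print(lsp)
```

Interpolating toward the current frame keeps the spectral envelope changing smoothly across the shorter target frames. A real transcoder would also need to handle the opposite direction and the pitch parameters, which this sketch does not cover.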