Summary: | Doctoral === National Chiao Tung University === Department of Communication Engineering === 87 === This study addresses two issues, perceptual enhancement of sinusoidal transform coding (STC) and optimal index assignment for vector quantization (VQ), with the aim of designing a 2.4 kb/s speech coder that is highly robust against channel errors. STC models the speech waveform as a sum of sinusoids whose frequencies, amplitudes, and phases are chosen to make the reconstruction a best fit to the original speech. The first part of this study focuses on quality enhancement of STC through the development of new parametric models. The benefits of the Bark spectrum are explored for perceptual coding of the sine-wave amplitudes. Compared with existing STC coders based on a cepstral representation, the Bark-based amplitude coder is preferred because it achieves a uniform perceptual fit across the spectrum. A further enhancement, which improves phase accuracy, is the use of a noncausal all-pole vocal system that better matches the maximum-phase nature of differentiated glottal pulses. The second part of the study concerns transmission of the vector-quantized Bark spectrum over a noisy channel. We formulate channel-robust VQ design as a combinatorial optimization problem that leads to a search for the minimum-distortion index assignment. To better track the statistical dependencies between error sequences, we incorporate a Markov characterization of the channel into the VQ design. Simulation results indicate that the global explorative properties of genetic algorithms make them very effective at finding the optimal index assignment, and that with this index assignment the vector quantizer can be designed to respond to various channel conditions.
|
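The index-assignment problem described in the summary can be illustrated with a minimal sketch. It is not the dissertation's method: it assumes a memoryless binary symmetric channel in place of the Markov channel model, and an exhaustive search in place of the genetic algorithm, so it is only practical for toy codebooks whose size is a power of two. The function names `transition_prob`, `channel_distortion`, and `best_assignment` are hypothetical and chosen for illustration.

```python
import itertools

import numpy as np


def transition_prob(i, j, n_bits, ber):
    """P(receive index j | send index i) on a memoryless binary symmetric channel.

    Assumption: bit errors are independent with probability `ber`
    (the summary's Markov channel would replace this function).
    """
    d = bin(i ^ j).count("1")  # Hamming distance between the two binary indices
    return (ber ** d) * ((1.0 - ber) ** (n_bits - d))


def channel_distortion(codebook, priors, assignment, ber):
    """Expected squared error due to channel errors for one index assignment.

    assignment[k] is the binary index transmitted for codevector k.
    """
    n = len(codebook)
    n_bits = (n - 1).bit_length()  # assumes n is a power of two
    total = 0.0
    for k in range(n):
        for m in range(n):
            p = transition_prob(assignment[k], assignment[m], n_bits, ber)
            total += priors[k] * p * float(np.sum((codebook[k] - codebook[m]) ** 2))
    return total


def best_assignment(codebook, priors, ber):
    """Exhaustive search over all n! assignments (toy sizes only)."""
    n = len(codebook)
    return min(
        itertools.permutations(range(n)),
        key=lambda a: channel_distortion(codebook, priors, a, ber),
    )
```

For a four-level scalar codebook, for example, `best_assignment(np.array([[0.0], [1.0], [2.0], [3.0]]), np.full(4, 0.25), 0.05)` returns a permutation whose channel distortion is no larger than that of any other assignment. The search space grows as n!, which is why a heuristic search such as the genetic algorithm mentioned in the summary becomes necessary for realistic codebook sizes.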