Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding

This paper presents a model-based method for coding the LSF parameters of LPC speech coders on a “long-term” basis, that is, beyond the usual 20–30 ms frame duration. The objective is to provide efficient LSF quantization for a speech coder with large delay...


Bibliographic Details
Main Author: Laurent Girin
Format: Article
Language: English
Published: SpringerOpen 2010-01-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
Online Access: http://dx.doi.org/10.1155/2010/597039
id doaj-4d8c1083a3ae4de4892698fe942441a2
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Laurent Girin
title Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding
publisher SpringerOpen
series EURASIP Journal on Audio, Speech, and Music Processing
issn 1687-4714, 1687-4722
publishDate 2010-01-01
description This paper presents a model-based method for coding the LSF parameters of LPC speech coders on a “long-term” basis, that is, beyond the usual 20–30 ms frame duration. The objective is to provide efficient LSF quantization for a speech coder with large delay but very- to ultra-low bit-rate (i.e., below 1 kb/s). To do this, speech is first segmented into voiced/unvoiced segments. A Discrete Cosine model of the time trajectory of the LSF vectors is then applied to each segment to capture the LSF interframe correlation over the whole segment. Bi-directional transformation from the model coefficients to a reduced set of LSF vectors enables both efficient “sparse” coding (using here multistage vector quantizers) and the generation of interpolated LSF vectors at the decoder. The proposed method provides up to 50% gain in bit-rate over frame-by-frame quantization while preserving signal quality and competes favorably with 2D-transform coding for the lower range of tested bit rates. Moreover, the implicit time-interpolation nature of the long-term coding process provides this technique a high potential for use in speech synthesis systems.
url http://dx.doi.org/10.1155/2010/597039
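
The abstract describes fitting a discrete cosine model to the time trajectory of the LSF vectors over a segment, then moving between the model coefficients and a reduced set of LSF vectors that the decoder can interpolate from. The following is a minimal Python/NumPy sketch of that idea, not the paper's implementation. The function names, the DCT-II-style basis, the evenly spaced anchor frames, and the model order are all assumptions chosen for illustration, and the quantization of the reduced vectors (multistage vector quantizers in the paper) is left out.

```python
import numpy as np


def cosine_basis(times, n_frames, order):
    """Return M with M[i, p] = cos(pi * p * (times[i] + 0.5) / n_frames).

    Assumed basis (DCT-II style); the paper's exact model may differ.
    """
    t = np.asarray(times, dtype=float)[:, None]
    p = np.arange(order + 1)[None, :]
    return np.cos(np.pi * p * (t + 0.5) / n_frames)


def encode_segment(lsf, order):
    """Fit the cosine model to one segment and return a reduced set of LSF vectors.

    lsf: array of shape (n_frames, dim), one LSF vector per frame.
    Returns (reduced, anchor_times), where reduced has shape (order + 1, dim).
    """
    n_frames = lsf.shape[0]
    basis = cosine_basis(np.arange(n_frames), n_frames, order)
    # Least-squares fit of the model coefficients, one column per LSF dimension.
    coeffs, *_ = np.linalg.lstsq(basis, lsf, rcond=None)
    # Hypothetical choice: sample the fitted model at evenly spaced time points.
    anchor_times = np.linspace(0, n_frames - 1, order + 1)
    reduced = cosine_basis(anchor_times, n_frames, order) @ coeffs
    return reduced, anchor_times


def decode_segment(reduced, anchor_times, n_frames):
    """Recover the model from the reduced vectors and interpolate every frame."""
    order = reduced.shape[0] - 1
    anchor_basis = cosine_basis(anchor_times, n_frames, order)
    # Square (order + 1) x (order + 1) system: reduced vectors -> coefficients.
    coeffs = np.linalg.solve(anchor_basis, reduced)
    return cosine_basis(np.arange(n_frames), n_frames, order) @ coeffs
```

In this sketch a 20-frame segment modeled with order 3 is represented by 4 reduced LSF vectors instead of 20, and the decoder regenerates all 20 frames from them. A real coder would quantize the reduced vectors before transmission, which this example does not attempt.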