Efficient Coding Strategies for Advanced Audio Coding
Doctoral === National Chiao Tung University === Department of Electronics Engineering === 93 === Advanced Audio Coding (AAC) is a recent, high-performance, sophisticated audio coder specified by the ISO/IEC MPEG standards committee. Because the AAC standard does not normatively specify the encoder, coding performance is greatly influenced by the design...
Main Authors: | Cheng-Han Yang, 楊政翰 |
---|---|
Other Authors: | Hsueh-Ming Hang |
Format: | Others |
Language: | en_US |
Published: |
2005
|
Online Access: | http://ndltd.ncl.edu.tw/handle/21330973882282279787 |
id |
ndltd-TW-093NCTU5428062 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-093NCTU5428062 2016-06-06T04:10:44Z http://ndltd.ncl.edu.tw/handle/21330973882282279787 Efficient Coding Strategies for Advanced Audio Coding 用於先進音訊編碼之高效率編碼策略 Cheng-Han Yang 楊政翰 Doctoral National Chiao Tung University Department of Electronics Engineering 93 Hsueh-Ming Hang 杭學鳴 2005 Degree thesis ; thesis 82 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Doctoral === National Chiao Tung University === Department of Electronics Engineering === 93 === Advanced Audio Coding (AAC) is a recent, high-performance, sophisticated audio coder specified by the ISO/IEC MPEG standards committee. Because the AAC standard does not normatively specify the encoder, coding performance depends heavily on the design of the coding modules (tools) in an AAC encoder. One critical element of a good AAC encoder is a properly designed rate-distortion (R-D) control algorithm. This algorithm and its related issues are the focus of this dissertation.
One well-known R-D control algorithm designed for AAC is the trellis-based algorithm, which performs a trellis search over the entire frame to find suitable coding parameters. It achieves very good coding performance, but its computational complexity is extremely high. The first contribution of this dissertation is the design of two types of low-complexity, high-performance R-D control algorithms: the Cascaded Trellis-Based (CTB) algorithm and the Enhanced BFOS (EBFOS) algorithm. CTB reduces the computational burden of the trellis-based algorithms by splitting their heavy calculation stage into two consecutive steps that require much less computation; the complexity is reduced further by significantly decreasing the number of candidates in the trellis search. EBFOS, instead of performing the trellis search over the entire frame, allocates bits step by step to the band that needs them most, taking into account both the band-level “bit-use efficiency” and the inter-band dependency of the AAC coding process. Simulation results show that the coding performance of both types of proposed algorithms is significantly better than that of the AAC Verification Model and close to that of the original high-cost trellis-based algorithms. Roughly, the proposed algorithms require less than 1/140 of the computation of the original trellis-based algorithms.
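The idea of allocating bits step by step to the band that needs them most can be illustrated with a generic greedy allocator. The sketch below is not the dissertation's EBFOS algorithm: the function name `greedy_bit_allocation`, the per-band energies, and the toy distortion model (each extra bit divides a band's distortion by four) are all assumptions made for illustration, and the inter-band dependency of AAC coding that the abstract mentions is deliberately ignored.

```python
import heapq


def greedy_bit_allocation(band_energy, total_bits):
    """Hand out bits one at a time to the band with the largest distortion drop.

    Hypothetical illustration only; not the EBFOS algorithm from the thesis.
    """
    if not band_energy:
        return []

    def distortion(energy, bits):
        # Toy model (assumption): every extra bit cuts distortion by a factor of 4.
        return energy * 4.0 ** (-bits)

    bits = [0] * len(band_energy)
    # heapq is a min-heap, so store the negated gain to pop the largest gain first.
    heap = [(-(distortion(e, 0) - distortion(e, 1)), b)
            for b, e in enumerate(band_energy)]
    heapq.heapify(heap)

    for _ in range(total_bits):
        _, b = heapq.heappop(heap)
        bits[b] += 1
        e = band_energy[b]
        # Gain this band would obtain from one more bit.
        next_gain = distortion(e, bits[b]) - distortion(e, bits[b] + 1)
        heapq.heappush(heap, (-next_gain, b))
    return bits


if __name__ == "__main__":
    print(greedy_bit_allocation([16.0, 4.0, 1.0], 3))
```

Each step costs only a heap update rather than a search over the whole frame, which is the kind of saving a step-by-step allocator aims for; a real encoder would replace the toy distortion model with measured rate and perceptual distortion.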
Despite the success of current audio coding techniques, little effort has been made to reduce the inter-channel redundancy inherent in multichannel audio compression. The second contribution of this dissertation is an efficient algorithm for removing inter-channel redundancy in perceptual audio coding. In this approach, perceptually weighted inter-channel prediction is applied to the Modified Discrete Cosine Transform (MDCT) coefficients. Based on this structure, two types of inter-channel predictor are proposed: a time-signal based predictor and a spectral-coefficient based predictor. As with the existing INT-DCT based approach, no extra perceptual masking control is needed, and the method introduces no audio quality degradation. For most typical audio sequences, the bit-rate reduction of the proposed method is higher than that of the INT-DCT based approach by about 10% or more.
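As a rough illustration of inter-channel prediction on spectral coefficients, the sketch below predicts one channel's coefficients in a band from another channel's with a single weighted least-squares gain and keeps only the residual. It is not the predictor proposed in the dissertation: the names `predict_band` and `reconstruct_band`, the one-gain-per-band structure, and the optional `weights` argument standing in for perceptual weighting are all assumptions, and quantization of the gain and residual is omitted.

```python
def predict_band(reference, target, weights=None):
    """Return (gain, residual) minimizing sum(w * (target - gain * reference)**2).

    Hypothetical illustration only; `weights` stands in for perceptual weighting.
    """
    if weights is None:
        weights = [1.0] * len(reference)
    energy = sum(w * x * x for w, x in zip(weights, reference))
    if energy == 0.0:
        # Nothing to predict from: send the target unchanged.
        return 0.0, list(target)
    cross = sum(w * x * y for w, x, y in zip(weights, reference, target))
    gain = cross / energy
    residual = [y - gain * x for x, y in zip(reference, target)]
    return gain, residual


def reconstruct_band(reference, gain, residual):
    """Decoder side: add the prediction back to the residual."""
    return [r + gain * x for x, r in zip(reference, residual)]


if __name__ == "__main__":
    left = [1.0, 2.0, -1.0]    # stand-in coefficients of the reference channel
    right = [0.5, 1.0, -0.5]   # stand-in coefficients of the predicted channel
    gain, residual = predict_band(left, right)
    print(gain, residual, reconstruct_band(left, gain, residual))
```

When the two channels are strongly correlated, the residual carries far less energy than the original coefficients, which is where the bit-rate saving would come from.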
|
author2 |
Hsueh-Ming Hang |
author_facet |
Hsueh-Ming Hang Cheng-Han Yang 楊政翰 |
author |
Cheng-Han Yang 楊政翰 |
spellingShingle |
Cheng-Han Yang 楊政翰 Efficient Coding Strategies for Advanced Audio Coding |
author_sort |
Cheng-Han Yang |
title |
Efficient Coding Strategies for Advanced Audio Coding |
title_short |
Efficient Coding Strategies for Advanced Audio Coding |
title_full |
Efficient Coding Strategies for Advanced Audio Coding |
title_fullStr |
Efficient Coding Strategies for Advanced Audio Coding |
title_full_unstemmed |
Efficient Coding Strategies for Advanced Audio Coding |
title_sort |
efficient coding strategies for advanced audio coding |
publishDate |
2005 |
url |
http://ndltd.ncl.edu.tw/handle/21330973882282279787 |
work_keys_str_mv |
AT chenghanyang efficientcodingstrategiesforadvancedaudiocoding AT yángzhènghàn efficientcodingstrategiesforadvancedaudiocoding AT chenghanyang yòngyúxiānjìnyīnxùnbiānmǎzhīgāoxiàolǜbiānmǎcèlüè AT yángzhènghàn yòngyúxiānjìnyīnxùnbiānmǎzhīgāoxiàolǜbiānmǎcèlüè |
_version_ |
1718294481407574016 |