Efficient Coding Strategies for Advanced Audio Coding

Doctoral === National Chiao Tung University === Department of Electronics Engineering === 93 === Advanced Audio Coding (AAC) is a recent, high-performance, sophisticated audio coder specified by the ISO/IEC MPEG standards committee. Because the encoder design in the AAC standard is non-normative, coding performance is greatly influenced by the design...

Bibliographic Details
Main Authors: Cheng-Han Yang, 楊政翰
Other Authors: Hsueh-Ming Hang
Format: Others
Language: en_US
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/21330973882282279787
id ndltd-TW-093NCTU5428062
record_format oai_dc
spelling ndltd-TW-093NCTU5428062 2016-06-06T04:10:44Z http://ndltd.ncl.edu.tw/handle/21330973882282279787 Efficient Coding Strategies for Advanced Audio Coding 用於先進音訊編碼之高效率編碼策略 Cheng-Han Yang 楊政翰 Doctoral National Chiao Tung University Department of Electronics Engineering 93 Hsueh-Ming Hang 杭學鳴 2005 thesis 82 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Chiao Tung University === Department of Electronics Engineering === 93 === Advanced Audio Coding (AAC) is a recent, high-performance, sophisticated audio coder specified by the ISO/IEC MPEG standards committee. Because the encoder design in the AAC standard is non-normative, coding performance is greatly influenced by the design of the coding modules (tools) in an AAC encoder. One critical element of a good AAC encoder is a properly designed rate-distortion (R-D) control algorithm. This algorithm and its related issues are the focus of this dissertation. One well-known R-D control algorithm designed for AAC is the trellis-based algorithm, which performs a trellis search through the entire frame to find proper coding parameters. It achieves very good performance, but its computational complexity is extremely high. The first contribution of this dissertation is the design of two types of low-complexity, high-performance R-D control algorithms: the Cascaded Trellis-Based (CTB) algorithm and the Enhanced BFOS (EBFOS) algorithm. In CTB, we reduce the computational burden of the trellis-based algorithms by splitting the heavy calculation stage of the trellis-based approach into two consecutive steps that require much less computation. The complexity is further reduced by significantly decreasing the number of candidates in the trellis search. In EBFOS, instead of performing the trellis search through the entire frame, we allocate bits step by step to the band that needs them most. This approach considers both the band-level “bit-use efficiency” and the inter-band dependency of the coding process in AAC. Simulation results show that the coding performance of both proposed types of R-D control algorithms is significantly better than that of the AAC Verification Model and close to that of the original high-cost trellis-based algorithms. Roughly, the proposed algorithms require less than 1/140 of the computation of the original trellis-based algorithms.
Despite the success of current audio coding techniques, little effort has been made to reduce the inter-channel redundancy inherent in multichannel audio compression. The second contribution of this dissertation is an efficient algorithm for removing inter-channel redundancy in perceptual audio coding. In our approach, perceptually weighted inter-channel prediction is applied to the Modified Discrete Cosine Transform (MDCT) coefficients. Based on this structure, two types of inter-channel predictor are proposed: the time-signal based predictor and the spectral-coefficient based predictor. As with the existing INT-DCT based approach, our approach needs no extra perceptual masking control; moreover, our method induces no audio quality degradation. For most typical audio sequences, the bit-rate reduction of our method exceeds that of the INT-DCT based approach by about 10% or more.
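The idea of weighted inter-channel prediction on MDCT coefficients mentioned above can be illustrated with a minimal sketch. This is not the dissertation's predictor: it assumes a single weighted least-squares gain per band, and the names `band_gain`, `predict_channel`, `weights`, and `band_edges`, as well as the toy coefficient values, are hypothetical. The weights stand in for a perceptual weighting (for example, something derived from masking thresholds), which the record does not specify.

```python
# Illustrative sketch only: one-tap, per-band inter-channel prediction.
# All names and numbers are assumptions, not taken from the dissertation.

def band_gain(ref, tgt, weights):
    """Weighted least-squares gain g minimising sum(w * (tgt - g * ref) ** 2)."""
    num = sum(w * r * t for w, r, t in zip(weights, ref, tgt))
    den = sum(w * r * r for w, r in zip(weights, ref))
    return num / den if den > 0.0 else 0.0


def predict_channel(ref, tgt, weights, band_edges):
    """Return per-band gains and the prediction residual of `tgt` given `ref`."""
    gains = []
    residual = [0.0] * len(tgt)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        g = band_gain(ref[lo:hi], tgt[lo:hi], weights[lo:hi])
        gains.append(g)
        for k in range(lo, hi):
            residual[k] = tgt[k] - g * ref[k]
    return gains, residual


def energy(x):
    """Sum of squared coefficients."""
    return sum(v * v for v in x)


# Toy "MDCT coefficients" for two correlated channels (made-up values).
left = [4.0, -2.0, 1.0, 3.0, 0.5, -1.5, 2.0, -0.5]
right = [3.6, -1.9, 0.8, 2.9, 0.1, -0.6, 0.9, -0.3]
weights = [1.0] * len(left)   # uniform placeholder for perceptual weights
band_edges = [0, 4, 8]        # two bands of four coefficients each

gains, residual = predict_channel(left, right, weights, band_edges)
# A decoder holding `left`, `gains`, and `residual` can rebuild `right`.
rebuilt = [g * l + r
           for (lo, hi), g in zip(zip(band_edges[:-1], band_edges[1:]), gains)
           for l, r in zip(left[lo:hi], residual[lo:hi])]
```

When the two channels are correlated, the residual carries less energy than the original channel, which is the redundancy such a predictor aims to remove. A real encoder would also have to quantize and transmit the gains, which this sketch ignores.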
author2 Hsueh-Ming Hang
author_facet Hsueh-Ming Hang
Cheng-Han Yang
楊政翰
author Cheng-Han Yang
楊政翰
spellingShingle Cheng-Han Yang
楊政翰
Efficient Coding Strategies for Advanced Audio Coding
author_sort Cheng-Han Yang
title Efficient Coding Strategies for Advanced Audio Coding
publishDate 2005
url http://ndltd.ncl.edu.tw/handle/21330973882282279787