Summary: | Doctoral === National Tsing Hua University === Department of Computer Science === 99 === Context-based Adaptive Binary Arithmetic Coding (CABAC), adopted by the H.264/AVC main profile, achieves a higher compression ratio than traditional variable-length coding. However, it incurs high computational complexity, and its throughput is limited by bit-level data dependencies. Moreover, for ultra-high-resolution applications such as QFHD (3840×2160), a partially hardwired architecture cannot meet the real-time requirement. Therefore, it is necessary to implement the CABAC function in a fully hardwired architecture.
After analyzing the distribution of syntax elements (SEs) and the bin distribution across SE types, we found that coefficient-block SEs and motion vector difference (MVD) SEs account for most of the bins, so our design must process these SEs efficiently. Furthermore, by analyzing the data dependencies of the CABAC algorithm, we concluded that a pipelined architecture is suitable for the binary arithmetic encoder (BAE) but not for the binary arithmetic decoder (BAD).
For the CABAC encoder, we designed a six-stage pipelined BAE that can encode up to eight bins per cycle. To keep up with the BAE throughput, we propose several acceleration methods that speed up the generation of bins and context indices. We further propose a novel architecture that shortens the critical path of renormalization and bit-stream generation. Simulation results show that our design encodes 1.33 bins per cycle on average and achieves a throughput of 295 Mbin/s. Running at 222 MHz, it can encode QFHD (3840×2160) video at 30 fps in real time for the H.264/AVC main profile, level 5.1. We have successfully integrated the proposed CABAC encoder into an H.264/AVC encoder system.
For the CABAC decoder, we propose a Two-Bin BAD engine that generates two bins in one cycle for the frequent SEs. To boost BAD utilization, we propose a method that improves the prediction accuracy of the second bin. Furthermore, we reallocate the context memory to shorten the critical path delay of the Two-Bin BAD circuit. Experimental results show that our CABAC decoder generates 1.25 bins per cycle on average. Running at 238 MHz, it achieves a throughput sufficient to decode QFHD video in real time for the H.264/AVC main profile, level 5.1. We have successfully integrated the proposed CABAC decoder into an H.264/AVC decoder system.
|
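The throughput figures quoted in the summary can be cross-checked with simple arithmetic: bins per cycle multiplied by clock frequency gives bins per second. The sketch below is only an illustrative check, not part of the thesis. The function names are hypothetical, the decoder throughput and the per-macroblock bin budget are values derived here rather than reported in the source, and 16×16 macroblocks are assumed as in H.264/AVC.

```python
def throughput_mbin_per_s(bins_per_cycle, clock_mhz):
    """Bins per cycle times clock frequency (MHz) gives Mbin/s."""
    return bins_per_cycle * clock_mhz


def macroblocks_per_second(width, height, fps, mb_size=16):
    """Macroblock rate for a given resolution and frame rate.

    Uses ceiling division so partially covered macroblocks are counted.
    """
    mbs_wide = -(-width // mb_size)
    mbs_high = -(-height // mb_size)
    return mbs_wide * mbs_high * fps


# Figures taken from the summary.
enc = throughput_mbin_per_s(1.33, 222)   # encoder: 1.33 bins/cycle at 222 MHz
dec = throughput_mbin_per_s(1.25, 238)   # decoder: 1.25 bins/cycle at 238 MHz

# Derived here (not stated in the source): QFHD macroblock rate at 30 fps
# and the average bin budget per macroblock that the encoder rate allows.
mb_rate = macroblocks_per_second(3840, 2160, 30)
enc_bins_per_mb = enc * 1e6 / mb_rate

print(f"encoder: {enc:.2f} Mbin/s")
print(f"decoder: {dec:.2f} Mbin/s")
print(f"QFHD at 30 fps: {mb_rate} macroblocks/s")
print(f"encoder budget: {enc_bins_per_mb:.1f} bins per macroblock")
```

The encoder result agrees with the 295 Mbin/s stated in the summary once rounded. The decoder value and the per-macroblock budget are only back-of-the-envelope estimates based on the rounded figures above.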