Summary: | Recent video coding standards typically rely on rate-distortion optimization (RDO) to make appropriate mode decisions during the encoding process. The newest standard, High Efficiency Video Coding (HEVC), introduces complex encoding structures and strong dependencies between coding units. In particular, the Lagrangian multiplier is a primary factor in the RDO procedure: it directly affects rate-distortion (R-D) performance, yet it is defined for an entire video frame. This paper proposes a novel approach for perceptually guiding the RDO process in HEVC. The reference encoder does not effectively consider the perceptual characteristics of the input video or the visual sensitivity of each coding tree unit (CTU) in a frame. Inspired by the mechanisms of the human visual system, the proposed solution adjusts the Lagrangian value at the CTU level based on a set of complementary perceptual features. The scheme takes into account the important visual information of a CTU and its temporal dependency on adjacent blocks. Feature extraction is implemented in the frequency domain using efficient spatio-temporal analysis. In our experiments, we adopted a perceptual mean squared error (MSE) metric and the structural similarity (SSIM) index. Under the perceptual MSE metric, the Bjøntegaard delta (BD) rate savings over the HEVC reference software HM16.12 were 4.41% and 6.14% for the random access (RA) and low delay (LD) encoding settings, respectively. Under SSIM, the BD-rate savings were 6.95% and 9.86% for the RA and LD settings, respectively. The proposed method also demonstrates superior R-D performance over a compared approach that adopts a similar scheme.
|
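To make the role of the Lagrangian multiplier concrete, the following is a minimal sketch of Lagrangian mode decision, where the cost is J = D + λR and the frame-level λ is scaled per CTU. It is not the paper's method: the `weight` factor, the `ctu_lambda` helper, and the candidate numbers are hypothetical placeholders for illustration, and the abstract does not specify how the perceptual features map to the adjustment.

```python
from dataclasses import dataclass


@dataclass
class ModeCandidate:
    name: str
    distortion: float  # e.g. sum of squared errors for the block
    rate_bits: float   # bits needed to code the block in this mode


def rd_cost(candidate: ModeCandidate, lam: float) -> float:
    """Lagrangian R-D cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def best_mode(candidates: list[ModeCandidate], lam: float) -> ModeCandidate:
    """Pick the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


def ctu_lambda(frame_lambda: float, weight: float) -> float:
    """Scale the frame-level lambda for one CTU.

    `weight` is a hypothetical perceptual factor (assumption): values
    below 1 favor lower distortion, values above 1 favor lower rate.
    """
    if weight <= 0:
        raise ValueError("weight must be positive")
    return frame_lambda * weight


# Illustrative numbers only, not measured data.
candidates = [
    ModeCandidate("skip", distortion=900.0, rate_bits=2.0),
    ModeCandidate("inter_2Nx2N", distortion=400.0, rate_bits=40.0),
    ModeCandidate("intra", distortion=250.0, rate_bits=120.0),
]

frame_lambda = 10.0
for weight in (0.1, 1.0, 2.0):
    lam = ctu_lambda(frame_lambda, weight)
    print(weight, best_mode(candidates, lam).name)
```

With these made-up numbers, a smaller λ steers the decision toward the low-distortion, high-rate mode, and a larger λ toward the cheap mode, which is why a single frame-wide value cannot reflect differences in visual sensitivity between CTUs.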