A Highly Parallel and Scalable Motion Estimation Algorithm with GPU for HEVC

We propose a highly parallel and scalable motion estimation algorithm, named multilevel resolution motion estimation (MLRME for short), by combining the advantages of local full search and downsampling. By subsampling a video frame, a large amount of computation is saved. While using the local full-...
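The abstract's idea of pairing downsampling with a local full search can be illustrated with a minimal sketch. It is not the authors' MLRME implementation: the two-level pyramid, the 2×2 averaging filter, the SAD cost, the search radii, and every function name below are assumptions made for illustration. A coarse full search on the half-resolution frame yields a candidate vector, which is doubled and refined by a small full search at full resolution.

```python
def downsample2(frame):
    """Halve the resolution by averaging each 2x2 pixel group (assumed filter)."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
             for x in range(w)] for y in range(h)]


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n, center, radius):
    """Exhaustive search over a (2*radius+1)^2 window around center.
    Each candidate is evaluated independently of the others."""
    h, w = len(ref), len(ref[0])
    cx, cy = center
    best, best_cost = center, None
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            # Skip candidates whose reference block leaves the frame.
            if by + dy < 0 or by + dy + n > h or bx + dx < 0 or bx + dx + n > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def two_level_search(cur, ref, bx, by, n, coarse_radius=2, refine_radius=1):
    """Coarse-to-fine motion search for one block (hypothetical two-level variant).
    Assumes even frame dimensions and even bx, by, n."""
    cur_lo, ref_lo = downsample2(cur), downsample2(ref)
    # Level 1: full search on the half-resolution frames.
    cdx, cdy = full_search(cur_lo, ref_lo, bx // 2, by // 2, n // 2,
                           (0, 0), coarse_radius)
    # Level 2: scale the coarse vector and refine it at full resolution.
    return full_search(cur, ref, bx, by, n, (2 * cdx, 2 * cdy), refine_radius)
```

Calling `two_level_search(cur, ref, bx, by, n)` returns a `(dx, dy)` displacement for the block whose top-left corner is `(bx, by)`. Because every candidate inside `full_search` is scored independently, the two inner loops are the part that would map naturally onto GPU threads.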


Bibliographic Details
Main Authors:	Yun-gang Xue, Hua-you Su, Ju Ren, Mei Wen, Chun-yuan Zhang, Li-quan Xiao
Format:	Article
Language:	English
Published:	Hindawi Limited 2017-01-01
Series:	Scientific Programming
Online Access:	http://dx.doi.org/10.1155/2017/1431574

id	doaj-45daba5e9f6640a6ab63ed775f7bc59b
record_format	Article
spelling	doaj-45daba5e9f6640a6ab63ed775f7bc59b | 2021-07-02T02:04:35Z | eng | Hindawi Limited | Scientific Programming | 1058-9244 | 1875-919X | 2017-01-01 | 2017 | 10.1155/2017/1431574 | 1431574 | A Highly Parallel and Scalable Motion Estimation Algorithm with GPU for HEVC | Yun-gang Xue, Hua-you Su, Ju Ren, Mei Wen, Chun-yuan Zhang, Li-quan Xiao (all: School of Computer, National University of Defense Technology, Changsha 410073, China) | http://dx.doi.org/10.1155/2017/1431574
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Yun-gang Xue Hua-you Su Ju Ren Mei Wen Chun-yuan Zhang Li-quan Xiao
author_facet	Yun-gang Xue Hua-you Su Ju Ren Mei Wen Chun-yuan Zhang Li-quan Xiao
author_sort	Yun-gang Xue
title	A Highly Parallel and Scalable Motion Estimation Algorithm with GPU for HEVC
publisher	Hindawi Limited
series	Scientific Programming
issn	1058-9244 1875-919X
publishDate	2017-01-01
description	We propose a highly parallel and scalable motion estimation algorithm, named multilevel resolution motion estimation (MLRME), that combines the advantages of local full search and downsampling. Subsampling a video frame saves a large amount of computation, while the local full-search method exposes massive parallelism and makes full use of powerful modern many-core accelerators such as GPUs and the Intel Xeon Phi. We integrated the proposed MLRME into HM12.0, and the experimental results showed that the encoding quality of the MLRME method is close to that of the fast motion estimation in HEVC, declining by less than 1.5%. We also implemented MLRME with CUDA, which obtained a 30–60x speed-up over the serial algorithm on a single CPU. Specifically, the parallel implementation of MLRME on a GTX 460 GPU can meet the real-time coding requirement, with about 25 fps for the 2560×1600 video format, while for 832×480 the performance is more than 100 fps.
url	http://dx.doi.org/10.1155/2017/1431574

Similar Items