Adaptation of MPDATA Heterogeneous Stencil Computation to Intel Xeon Phi Coprocessor

The multidimensional positive definite advection transport algorithm (MPDATA) belongs to the group of nonoscillatory forward-in-time algorithms and performs a sequence of stencil computations. MPDATA is one of the major parts of the dynamic core of the EULAG geophysical model. In this work, we outli...

Bibliographic Details
Main Authors:	Lukasz Szustak, Krzysztof Rojek, Tomasz Olas, Lukasz Kuczynski, Kamil Halbiniak, Pawel Gepner
Format:	Article
Language:	English
Published:	Hindawi Limited 2015-01-01
Series:	Scientific Programming
Online Access:	http://dx.doi.org/10.1155/2015/642705

id	doaj-b85f21087c314769baf23c230ffe09e7
record_format	Article
spelling	doaj-b85f21087c314769baf23c230ffe09e7; 2021-07-02T02:13:38Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 2015-01-01; 2015; 10.1155/2015/642705; 642705; Adaptation of MPDATA Heterogeneous Stencil Computation to Intel Xeon Phi Coprocessor; Lukasz Szustak (0); Krzysztof Rojek (1); Tomasz Olas (2); Lukasz Kuczynski (3); Kamil Halbiniak (4); Pawel Gepner (5); affiliations in author order: (0)-(4) Czestochowa University of Technology, Częstochowa, Poland; (5) Intel Corporation, Pipers Way, Swindon, Wiltshire SN3 1RJ, UK; http://dx.doi.org/10.1155/2015/642705
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Lukasz Szustak Krzysztof Rojek Tomasz Olas Lukasz Kuczynski Kamil Halbiniak Pawel Gepner
title	Adaptation of MPDATA Heterogeneous Stencil Computation to Intel Xeon Phi Coprocessor
publisher	Hindawi Limited
series	Scientific Programming
issn	1058-9244 1875-919X
publishDate	2015-01-01
description	The multidimensional positive definite advection transport algorithm (MPDATA) belongs to the group of nonoscillatory forward-in-time algorithms and performs a sequence of stencil computations. MPDATA is one of the major parts of the dynamic core of the EULAG geophysical model. In this work, we outline an approach to adapting the 3D MPDATA algorithm to the Intel MIC architecture. In order to utilize available computing resources, we propose the (3 + 1)D decomposition of MPDATA heterogeneous stencil computations. This approach is based on a combination of loop tiling and loop fusion techniques. It allows us to ease memory/communication bounds and better exploit the theoretical floating point efficiency of target computing platforms. An important method of improving the efficiency of the (3 + 1)D decomposition is partitioning the available cores/threads into work teams, which reduces inter-cache communication overheads. This method also increases opportunities for the efficient distribution of MPDATA computation onto available resources of the Intel MIC architecture, as well as Intel CPUs. We discuss preliminary performance results obtained on two hybrid platforms, each containing two CPUs and an Intel Xeon Phi. The top-of-the-line Intel Xeon Phi 7120P gives the best performance results and executes MPDATA almost 2 times faster than two Intel Xeon E5-2697v2 CPUs.
url	http://dx.doi.org/10.1155/2015/642705

Similar Items