A Performance Study of a Dual Xeon-Phi Cluster for the Forward Modelling of Gravitational Fields


Bibliographic Details
Main Authors: Maricela Arroyo, Carlos Couder-Castañeda, Alfredo Trujillo-Alcantara, Israel-Enrique Herrera-Diaz, Nain Vera-Chavez
Format: Article
Language: English
Published: Hindawi Limited 2015-01-01
Series: Scientific Programming
Online Access: http://dx.doi.org/10.1155/2015/316012
Description
Summary: With at least 60 processing cores, the Xeon-Phi coprocessor is a truly multicore architecture offering an interconnection speed among cores of 240 GB/s, two levels of cache memory, a theoretical performance of 1.01 Tflops, and programming flexibility, all of which make the Xeon-Phi an excellent coprocessor for parallelizing applications that seek to reduce computational times. The objective of this work is to migrate a geophysical application designed to directly calculate the gravimetric tensor components and their derivatives, and in this way to investigate the performance of one and two Xeon-Phi coprocessors integrated on the same node and distributed across several nodes. The application allows the design factors that drive good performance to be analyzed and the results to be compared against a conventional multicore CPU. This research presents an efficient strategy based on nested parallelism using OpenMP: the outer level of the design acts as a controller of the interconnected Xeon-Phi coprocessors, while the inner level parallelizes the loops. MPI is subsequently used to reduce the information among the nodes of the cluster.
ISSN: 1058-9244, 1875-919X
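
The two-level strategy described in the summary (an outer level that hands work to each coprocessor, an inner level that runs the loops, and a final reduction of the partial results) can be illustrated with a small stand-in. The sketch below is not the authors' code: it uses Python threads in place of OpenMP and MPI, point masses in place of whatever source discretization the article uses, and computes only the vertical attraction rather than the full tensor. The function names `gz_partial` and `gz_forward`, the splitting of sources across workers, and the `n_workers` parameter are all assumptions made for illustration. Python threads will not give a real speedup here; the point is only the structure of the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def gz_partial(sources, observers):
    """Vertical attraction at every observer from one subset of point masses.

    sources:   list of (x, y, z, mass) tuples (hypothetical layout)
    observers: list of (x, y, z) tuples, z measured positive downward
    """
    out = []
    # Inner level: in an OpenMP code this is the loop a parallel-for would split.
    for ox, oy, oz in observers:
        total = 0.0
        for sx, sy, sz, mass in sources:
            dx, dy, dz = sx - ox, sy - oy, sz - oz
            r = math.sqrt(dx * dx + dy * dy + dz * dz)
            total += G * mass * dz / r**3
        out.append(total)
    return out


def gz_forward(sources, observers, n_workers=2):
    """Split the sources across workers, then sum the partial fields."""
    # Outer level: one chunk of sources per worker (standing in for one
    # controller thread per coprocessor). This split is an assumption.
    chunks = [sources[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: gz_partial(c, observers), chunks))
    # Reduction: element-wise sum of the partial fields, playing the role
    # that the summary assigns to MPI across cluster nodes.
    return [sum(values) for values in zip(*partials)]
```

Because gravitational contributions add linearly, splitting the sources and summing the partial fields gives the same result as a single serial pass, up to floating-point rounding from the changed summation order.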