A Performance Tuning Methodology with Compiler Support

We have developed an environment, based upon robust, existing, open source software, for tuning applications written using MPI, OpenMP or both. The goal of this effort, which integrates the OpenUH compiler and several popular performance tools, is to increase user productivity by providing an automa...

Bibliographic Details
Main Authors: Oscar Hernandez, Barbara Chapman, Haoqiang Jin
Format: Article
Language: English
Published: Hindawi Limited 2008-01-01
Series: Scientific Programming
Online Access: http://dx.doi.org/10.3233/SPR-2008-0253
id doaj-3b3e8b85f56d471587c7b4b4cf11ed8d
record_format Article
spelling doaj-3b3e8b85f56d471587c7b4b4cf11ed8d; 2021-07-02T04:14:39Z; eng; Hindawi Limited; Scientific Programming; ISSN 1058-9244, 1875-919X; 2008-01-01; vol. 16, no. 2-3, pp. 135-153; doi:10.3233/SPR-2008-0253; A Performance Tuning Methodology with Compiler Support
Oscar Hernandez, Computer Science Department, University of Houston, Houston, TX, USA
Barbara Chapman, Computer Science Department, University of Houston, Houston, TX, USA
Haoqiang Jin, NASA Advanced Supercomputing Division, NASA Ames Research Center, Moffett Field, CA, USA
http://dx.doi.org/10.3233/SPR-2008-0253
collection DOAJ
language English
format Article
sources DOAJ
author Oscar Hernandez
Barbara Chapman
Haoqiang Jin
title A Performance Tuning Methodology with Compiler Support
publisher Hindawi Limited
series Scientific Programming
issn 1058-9244
1875-919X
publishDate 2008-01-01
description We have developed an environment, based upon robust, existing, open source software, for tuning applications written using MPI, OpenMP or both. The goal of this effort, which integrates the OpenUH compiler and several popular performance tools, is to increase user productivity by providing an automated, scalable performance measurement and optimization system. In this paper we describe our environment, show how these complementary tools can work together, and illustrate the synergies possible by exploiting their individual strengths and combined interactions. We also present a methodology for performance tuning that is enabled by this environment. One of the benefits of using compiler technology in this context is that it can direct the performance measurements to capture events at different levels of granularity and help assess their importance, which we have shown to significantly reduce the measurement overheads. The compiler can also help when attempting to understand the performance results: it can supply information on how a code was translated and whether optimizations were applied. Our methodology combines two performance views of the application to find bottlenecks. The first is a high level view that focuses on OpenMP/MPI performance problems such as synchronization cost and load imbalances; the second is a low level view that focuses on hardware counter analysis with derived metrics that assess the efficiency of the code. Our experiments have shown that our approach can significantly reduce overheads for both profiling and tracing to acceptable levels and limit the number of times the application needs to be run with selected hardware counters. In this paper, we demonstrate the workings of this methodology by illustrating its use with selected NAS Parallel Benchmarks and a cloud resolving code.
url http://dx.doi.org/10.3233/SPR-2008-0253