P3T+: A Performance Estimator for Distributed and Parallel Programs

Developing distributed and parallel programs on today's multiprocessor architectures is still a challenging task. Particularly distressing is the lack of effective performance tools that support the programmer in evaluating changes in code, problem and machine sizes, and target architectures. In...

Bibliographic Details
Main Authors: T. Fahringer, A. Požgaj
Format: Article
Language: English
Published: Hindawi Limited 2000-01-01
Series: Scientific Programming
Online Access: http://dx.doi.org/10.1155/2000/217384
id doaj-686f318591bb4119ad986d2c066a8f81
record_format Article
spelling doaj-686f318591bb4119ad986d2c066a8f81; 2021-07-02T02:57:29Z; eng; Hindawi Limited; Scientific Programming; ISSN 1058-9244, 1875-919X; 2000-01-01; vol. 8, no. 2, pp. 73-93; 10.1155/2000/217384; P3T+: A Performance Estimator for Distributed and Parallel Programs; T. Fahringer, A. Požgaj (both: Institute for Software Science, University of Vienna, Liechtensteinstrasse 22, A-1090 Vienna, Austria); http://dx.doi.org/10.1155/2000/217384
collection DOAJ
language English
format Article
sources DOAJ
author T. Fahringer
A. Požgaj
title P3T+: A Performance Estimator for Distributed and Parallel Programs
publisher Hindawi Limited
series Scientific Programming
issn 1058-9244
1875-919X
publishDate 2000-01-01
description Developing distributed and parallel programs on today's multiprocessor architectures is still a challenging task. Particularly distressing is the lack of effective performance tools that support the programmer in evaluating changes in code, problem and machine sizes, and target architectures. In this paper we introduce P3T+, a performance estimator for mostly regular HPF (High Performance Fortran) programs that also partially covers message-passing programs (MPI). P3T+ is unique in modeling programs, compiler code transformations, and parallel and distributed architectures. It computes at compile time a variety of performance parameters, including work distribution, number of transfers, amount of data transferred, transfer times, computation times, and number of cache misses. Several novel technologies are employed to compute these parameters: loop iteration spaces, array access patterns, and data distributions are modeled by highly effective symbolic analysis. Communication is estimated by simulating the behavior of a communication library used by the underlying compiler. Computation times are predicted through pre-measured kernels on every target architecture of interest. We carefully model the most critical architecture-specific factors, such as cache line sizes, number of cache lines available, startup times, and message transfer time per byte. P3T+ has been implemented and is closely integrated with the Vienna High Performance Compiler (VFC) to support programmers in developing parallel and distributed applications. Experimental results for realistic kernel codes taken from real-world applications are presented to demonstrate both the accuracy and the usefulness of P3T+.
url http://dx.doi.org/10.1155/2000/217384
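
Illustrative note: the description names work distribution, startup times, and message transfer time per byte among the quantities P3T+ models. The Python sketch below is not taken from the paper and does not reproduce P3T+'s analysis; it only shows, under stated assumptions, what two such parameters could look like in the simplest case. It assumes an HPF-style BLOCK distribution with block size ceil(N/P) and a linear message cost of startup plus bytes times per-byte time. The function names and the numeric values are hypothetical.

```python
from math import ceil


def block_work_distribution(n_iterations: int, n_procs: int) -> list[int]:
    """Iterations owned by each processor under an assumed BLOCK distribution.

    Each processor gets a contiguous block of ceil(N / P) iterations;
    trailing processors may receive fewer, or none.
    """
    block = ceil(n_iterations / n_procs)
    return [max(0, min(block, n_iterations - p * block)) for p in range(n_procs)]


def transfer_time(n_bytes: int, startup: float, per_byte: float) -> float:
    """Assumed linear message cost: fixed startup plus a per-byte term (seconds)."""
    return startup + n_bytes * per_byte


# Hypothetical numbers, chosen only for illustration.
work = block_work_distribution(10, 4)          # 10 loop iterations on 4 processors
cost = transfer_time(1000, startup=50e-6, per_byte=10e-9)
print(work, cost)
```

In a model of this kind, an uneven result from block_work_distribution would indicate load imbalance, and the startup term dominates the cost of small messages, which is why the number of transfers and the amount of data transferred are reported as separate parameters.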