Optimizing Fortran90D/HPF for distributed-memory computers

High Performance Fortran (HPF), as well as its predecessor FortranD, has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and u...


Bibliographic Details
Main Author: Roth, Gerald H.
Format: Others
Language: English
Published: 2007
Subjects: Computer Science
Online Access: http://hdl.handle.net/1911/19201
id ndltd-RICE-oai-scholarship.rice.edu-1911-19201
record_format oai_dc
spelling ndltd-RICE-oai-scholarship.rice.edu-1911-19201; 2013-10-23T04:08:16Z; Optimizing Fortran90D/HPF for distributed-memory computers; Roth, Gerald H.; Computer Science; 2007-08-21T01:49:05Z; 2007-08-21T01:49:05Z; 1997; Thesis; Text; application/pdf; http://hdl.handle.net/1911/19201; eng
collection NDLTD
sources NDLTD
topic Computer Science
description High Performance Fortran (HPF), as well as its predecessor FortranD, has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and use data layout directives to direct the partitioning of the data and computation among the processors of a parallel machine. For HPF to gain acceptance as a vehicle for parallel scientific programming, it must achieve high performance on problems for which it is well suited. To achieve high performance with an HPF program on a distributed-memory parallel machine, an HPF compiler must do a superb job of translating Fortran90 data-parallel array constructs into an efficient sequence of operations that minimize the overhead associated with data movement and also maximize data locality. This dissertation presents and analyzes a set of advanced optimizations designed to improve the execution performance of HPF programs on distributed-memory architectures. Presented is a methodology for performing deep analysis of Fortran90 programs, eliminating the reliance upon pattern matching to drive the optimizations as is done in many Fortran90 compilers. The optimizations address the overhead of data movement, both interprocessor and intraprocessor movement, that results from the translation of Fortran90 array constructs. Additional optimizations address the issues of scalarizing array assignment statements, loop fusion, and data locality. The combination of these optimizations results in a compiler that is capable of optimizing dense matrix stencil computations more completely than all previous efforts in this area. This work is distinguished by advanced compile-time analysis and optimizations performed at the whole-array level as opposed to analysis and optimization performed at the loop or array-element levels.
author Roth, Gerald H.
title Optimizing Fortran90D/HPF for distributed-memory computers