Optimizing Fortran90D/HPF for distributed-memory computers

High Performance Fortran (HPF), as well as its predecessor FortranD, has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and u...

Bibliographic Details
Main Author:	Roth, Gerald H.
Format:	Others
Language:	English
Published:	2007
Subjects:	Computer Science
Online Access:	http://hdl.handle.net/1911/19201

id	ndltd-RICE-oai-scholarship.rice.edu-1911-19201
record_format	oai_dc
spelling	ndltd-RICE-oai-scholarship.rice.edu-1911-19201 2013-10-23T04:08:16Z Optimizing Fortran90D/HPF for distributed-memory computers Roth, Gerald H. Computer Science High Performance Fortran (HPF), as well as its predecessor FortranD, has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and use data layout directives to direct the partitioning of the data and computation among the processors of a parallel machine. For HPF to gain acceptance as a vehicle for parallel scientific programming, it must achieve high performance on problems for which it is well suited. To achieve high performance with an HPF program on a distributed-memory parallel machine, an HPF compiler must do a superb job of translating Fortran90 data-parallel array constructs into an efficient sequence of operations that minimize the overhead associated with data movement and also maximize data locality. This dissertation presents and analyzes a set of advanced optimizations designed to improve the execution performance of HPF programs on distributed-memory architectures. Presented is a methodology for performing deep analysis of Fortran90 programs, eliminating the reliance upon pattern matching to drive the optimizations as is done in many Fortran90 compilers. The optimizations address the overhead of data movement, both interprocessor and intraprocessor movement, that results from the translation of Fortran90 array constructs. Additional optimizations address the issues of scalarizing array assignment statements, loop fusion, and data locality. The combination of these optimizations results in a compiler that is capable of optimizing dense matrix stencil computations more completely than all previous efforts in this area. This work is distinguished by advanced compile-time analysis and optimizations performed at the whole-array level as opposed to analysis and optimization performed at the loop or array-element levels. 2007-08-21T01:49:05Z 2007-08-21T01:49:05Z 1997 Thesis Text application/pdf http://hdl.handle.net/1911/19201 eng
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Computer Science
spellingShingle	Computer Science Roth, Gerald H. Optimizing Fortran90D/HPF for distributed-memory computers
description	High Performance Fortran (HPF), as well as its predecessor FortranD, has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and use data layout directives to direct the partitioning of the data and computation among the processors of a parallel machine. For HPF to gain acceptance as a vehicle for parallel scientific programming, it must achieve high performance on problems for which it is well suited. To achieve high performance with an HPF program on a distributed-memory parallel machine, an HPF compiler must do a superb job of translating Fortran90 data-parallel array constructs into an efficient sequence of operations that minimize the overhead associated with data movement and also maximize data locality. This dissertation presents and analyzes a set of advanced optimizations designed to improve the execution performance of HPF programs on distributed-memory architectures. Presented is a methodology for performing deep analysis of Fortran90 programs, eliminating the reliance upon pattern matching to drive the optimizations as is done in many Fortran90 compilers. The optimizations address the overhead of data movement, both interprocessor and intraprocessor movement, that results from the translation of Fortran90 array constructs. Additional optimizations address the issues of scalarizing array assignment statements, loop fusion, and data locality. The combination of these optimizations results in a compiler that is capable of optimizing dense matrix stencil computations more completely than all previous efforts in this area. This work is distinguished by advanced compile-time analysis and optimizations performed at the whole-array level as opposed to analysis and optimization performed at the loop or array-element levels.
author	Roth, Gerald H.
author_facet	Roth, Gerald H.
author_sort	Roth, Gerald H.
title	Optimizing Fortran90D/HPF for distributed-memory computers
title_short	Optimizing Fortran90D/HPF for distributed-memory computers
title_full	Optimizing Fortran90D/HPF for distributed-memory computers
title_fullStr	Optimizing Fortran90D/HPF for distributed-memory computers
title_full_unstemmed	Optimizing Fortran90D/HPF for distributed-memory computers
title_sort	optimizing fortran90d/hpf for distributed-memory computers
publishDate	2007
url	http://hdl.handle.net/1911/19201
work_keys_str_mv	AT rothgeraldh optimizingfortran90dhpffordistributedmemorycomputers
_version_	1716610235600207872

Similar Items