Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages
The optimization techniques for hierarchical O(N) N-body algorithms described here focus on managing the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of each node. We show how the techniques can be expressed in data-parallel...
Main Authors: | Yu Hu, S. Lennart Johnsson |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 1996-01-01 |
Series: | Scientific Programming |
Online Access: | http://dx.doi.org/10.1155/1996/425936 |
id | doaj-618a5f5b769544d49d8d62118a527dd6 |
---|---|
record_format | Article |
spelling | doaj-618a5f5b769544d49d8d62118a527dd6; 2021-07-02T03:25:10Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 1996-01-01; 5; 4; 337; 364; 10.1155/1996/425936; Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages; Yu Hu (0); S. Lennart Johnsson (1); Aiken Computation Laboratory, Harvard University, Cambridge, MA 02138, USA; Aiken Computation Laboratory, Harvard University, Cambridge, MA 02138, USA; The optimization techniques for hierarchical O(N) N-body algorithms described here focus on managing the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of each node. We show how the techniques can be expressed in data-parallel languages, such as High Performance Fortran (HPF) and Connection Machine Fortran (CMF). The effectiveness of our techniques is demonstrated on an implementation of Anderson's hierarchical O(N) N-body method for the Connection Machine system CM-5/5E. Of the total execution time, communication accounts for about 10–20% of the total time, with the average efficiency for arithmetic operations being about 40% and the total efficiency (including communication) being about 35%. For the CM-5E, a performance in excess of 60 Mflop/s per node (peak 160 Mflop/s per node) has been measured.; http://dx.doi.org/10.1155/1996/425936 |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Yu Hu; S. Lennart Johnsson |
spellingShingle | Yu Hu; S. Lennart Johnsson; Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages; Scientific Programming |
author_facet | Yu Hu; S. Lennart Johnsson |
author_sort | Yu Hu |
title | Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages |
title_short | Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages |
title_full | Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages |
title_fullStr | Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages |
title_full_unstemmed | Implementing O(N) N-Body Algorithms Efficiently in Data-Parallel Languages |
title_sort | implementing o(n) n-body algorithms efficiently in data-parallel languages |
publisher | Hindawi Limited |
series | Scientific Programming |
issn | 1058-9244; 1875-919X |
publishDate | 1996-01-01 |
description | The optimization techniques for hierarchical O(N) N-body algorithms described here focus on managing the data distribution and the data references, both between the memories of different nodes and within the memory hierarchy of each node. We show how the techniques can be expressed in data-parallel languages, such as High Performance Fortran (HPF) and Connection Machine Fortran (CMF). The effectiveness of our techniques is demonstrated on an implementation of Anderson's hierarchical O(N) N-body method for the Connection Machine system CM-5/5E. Communication accounts for about 10–20% of the total execution time; the average efficiency for arithmetic operations is about 40%, and the total efficiency (including communication) is about 35%. For the CM-5E, a performance in excess of 60 Mflop/s per node (peak 160 Mflop/s per node) has been measured. |
url | http://dx.doi.org/10.1155/1996/425936 |
work_keys_str_mv | AT yuhu implementingonnbodyalgorithmsefficientlyindataparallellanguages AT slennartjohnsson implementingonnbodyalgorithmsefficientlyindataparallellanguages |
_version_ | 1721341605478137856 |