Broadcast and Reduction Scheduling Optimization for Heterogeneous Systems

Bibliographic Details
Main Authors: Tzu-Hao Shen, 沈子皓
Other Authors: Pangfeng Liu
Format: Others
Language: en_US
Published: 2001
Online Access: http://ndltd.ncl.edu.tw/handle/39258009541701903955
Description
Summary: Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 89 === A network of workstations (NOW) is a cost-effective alternative to massively parallel supercomputers. As commercially available off-the-shelf processors become cheaper and faster, it is now possible to build a PC or workstation cluster that provides high computing power within a limited budget. However, a cluster may consist of different types of processors, and this heterogeneity complicates the design of efficient collective communication protocols. This dissertation shows that a simple heuristic called fastest-node-first (FNF) is very effective in reducing broadcast time for heterogeneous cluster systems. Although FNF does not always give the optimal broadcast time for a general heterogeneous network of workstations (HNOW), we prove that it always gives the optimal broadcast time in several special cases of clusters. Based on these special-case results, we show that FNF is an approximation algorithm that guarantees a competitive ratio of 2. From these theoretical results we also derive techniques to speed up the branch-and-bound search for the optimal broadcast schedule in an HNOW. We also show that a simple algorithm called slowest-node-first (SNF) is a very efficient reduction protocol for heterogeneous clusters. First, we show that SNF is an approximation algorithm with a competitive ratio of 2. In addition, we show that SNF gives the optimal reduction time when the cluster consists of two types of processors and the communication speed ratio between them is at least 2. Finally, we apply these theoretical results to branch-and-bound search and show that they can reduce the search time by a factor of 500.
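The summary names the fastest-node-first heuristic without spelling out a cost model, so the following is only a minimal sketch under stated assumptions, not the thesis's own formulation. It assumes each node i has a fixed per-message send cost send_costs[i], that a message sent at time s by node i arrives at time s + send_costs[i], that a node sends one message at a time, and that a receiver can start sending as soon as the message arrives. The function name fnf_broadcast and its parameters are hypothetical. Under these assumptions, the sender whose next transmission would finish earliest always delivers to the fastest node that has not yet received the message.

```python
import heapq


def fnf_broadcast(send_costs, source=0):
    """Greedy fastest-node-first broadcast under an ASSUMED sender-cost model.

    send_costs[i] is the (assumed) time node i needs to send one message.
    Returns (schedule, completion_time), where schedule is a list of
    (sender, receiver, arrival_time) tuples.
    """
    n = len(send_costs)
    # Receivers ordered fastest first (smallest send cost first; ties by index).
    pending = sorted(
        (i for i in range(n) if i != source),
        key=lambda i: (send_costs[i], i),
    )
    # Min-heap of (time at which the node's next send would complete, node).
    heap = [(send_costs[source], source)]
    schedule = []
    completion_time = 0
    for receiver in pending:
        # The sender that can finish a transmission earliest goes next.
        arrival, sender = heapq.heappop(heap)
        schedule.append((sender, receiver, arrival))
        completion_time = max(completion_time, arrival)
        # The sender starts its next transmission immediately.
        heapq.heappush(heap, (arrival + send_costs[sender], sender))
        # The receiver now holds the message and becomes a sender too.
        heapq.heappush(heap, (arrival + send_costs[receiver], receiver))
    return schedule, completion_time


if __name__ == "__main__":
    schedule, completion_time = fnf_broadcast([1, 1, 2, 3])
    for sender, receiver, arrival in schedule:
        print(f"node {sender} -> node {receiver}, arrives at t={arrival}")
    print("broadcast completes at t =", completion_time)
```

With the illustrative costs [1, 1, 2, 3], the source first sends to the other fast node, which then helps forward the message while the source continues sending. As the summary notes, a greedy schedule of this kind is not guaranteed to be optimal on a general heterogeneous cluster.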