Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme

博士 === 國立成功大學 === 電機工程學系碩博士班 === 92 ===   Multi-threaded distributed shared memory (DSM) system consists of computer cluster on the network and provides a parallel computing environment. In the past, most DSM systems divided the computation by distributing the threads equally to every node. If the C...

Full description

Bibliographic Details
Main Authors: Yi-Chang Zhuang, 莊宜璋
Other Authors: Tyng-Yeu Liang
Format: Others
Language:en_US
Published: 2004
Online Access:http://ndltd.ncl.edu.tw/handle/90527844641598442325
Description
Summary:博士 === 國立成功大學 === 電機工程學系碩博士班 === 92 ===   Multi-threaded distributed shared memory (DSM) system consists of computer cluster on the network and provides a parallel computing environment. In the past, most DSM systems divided the computation by distributing the threads equally to every node. If the CPU performance of each node is different, the node with the lowest processing power will be the bottleneck of the system performance and becomes the factor to dominate the execution time of the program. Past dynamic load balance mechanisms employed thread migration to balance the system workload and to reduce the program execution time. At run time, the mechanism redistributed the working threads according to the performance of each node. However, we find that there is a problem with using thread migration for the balance of system workload. Although thread migration can effectively balance system workload at run time, it will probably result in large amount of inter-node communication if the threads sharing the same data are redistributed to different nodes. Therefore, the amount of network message will increase substantially and has great impact on system performance. To solve this problem, we propose a group-based load balance scheme in this research. The proposed scheme takes the data sharing factor into consideration to reduce the network communication resulted from thread migration.   In addition, using more nodes to execute program cannot necessarily guarantee to obtain better performance on DSM systems. Since DSM has to propagate the update of program data among the system nodes, the amount of network message will be proportional to the number of nodes in the system. If the number of nodes is greater than a threshold, the time spent in network communication will be more than the execution time reduced by parallel computation. Execution time of the program will not reduce as the number of nodes increases. Consequently, it is critical to find the best system scale for the performance of a DSM program. In this paper, we propose a model to characterize the program execution behavior on DSM. Besides, we use the run time information of the program and the DSM system to predict the system performance. According to the predicted results, the reconfiguration mechanism adapts system configuration dynamically and redistributes the working threads by group-based load balance scheme to obtain the best system performance. We implement and test the proposed mechanism on a DSM platform, Cohesion. The experimental results show that the system configuration with group-based workload redistribution scheme is necessary for the performance improvement of a DSM program. It can not only reduce the program execution time but it also can increase the utilization of the computing resource of the system.