Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme

PhD === National Cheng Kung University === Department of Electrical Engineering (Master's and Doctoral Program) === 92


Bibliographic Details
Main Authors: Yi-Chang Zhuang, 莊宜璋
Other Authors: Tyng-Yeu Liang
Format: Others
Language: en_US
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/90527844641598442325
id ndltd-TW-092NCKU5442116
record_format oai_dc
spelling ndltd-TW-092NCKU54421162016-06-17T04:16:58Z http://ndltd.ncl.edu.tw/handle/90527844641598442325 Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme 分散式共用記憶體系統採群組工作分配方式之系統重組機制 Yi-Chang Zhuang 莊宜璋 PhD National Cheng Kung University Department of Electrical Engineering (Master's and Doctoral Program) 92   A multi-threaded distributed shared memory (DSM) system consists of a cluster of computers on a network and provides a parallel computing environment. In the past, most DSM systems divided the computation by distributing threads equally among the nodes. If the nodes differ in CPU performance, the node with the lowest processing power becomes the bottleneck of the system and dominates the execution time of the program. Previous dynamic load-balancing mechanisms employed thread migration to balance the system workload and reduce program execution time: at run time, they redistributed the working threads according to the performance of each node. However, thread migration has a drawback. Although it can effectively balance the system workload at run time, it may cause a large amount of inter-node communication if threads sharing the same data are redistributed to different nodes. The volume of network messages then increases substantially, with a severe impact on system performance. To solve this problem, we propose a group-based load-balancing scheme that takes data sharing into consideration to reduce the network communication resulting from thread migration.   In addition, using more nodes to execute a program does not necessarily yield better performance on DSM systems. Since DSM must propagate updates of program data among the nodes, the volume of network messages is proportional to the number of nodes in the system. If the number of nodes exceeds a threshold, the time spent on network communication outweighs the execution time saved by parallel computation, and program execution time no longer decreases as nodes are added. Consequently, finding the best system scale is critical to the performance of a DSM program. In this dissertation, we propose a model that characterizes program execution behavior on DSM and uses run-time information from the program and the DSM system to predict system performance. According to the predictions, the reconfiguration mechanism adapts the system configuration dynamically and redistributes the working threads with the group-based load-balancing scheme to obtain the best system performance. We implement and test the proposed mechanism on a DSM platform, Cohesion. The experimental results show that system reconfiguration with the group-based workload redistribution scheme is essential to improving the performance of a DSM program: it not only reduces program execution time but also increases the utilization of the system's computing resources. Tyng-Yeu Liang Ce-Kuen Shieh 梁廷宇 謝錫堃 2004 thesis 113 en_US
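The abstract's group-based idea — keep threads that share data in one group, and migrate whole groups rather than individual threads — can be sketched as follows. This is a minimal illustration with hypothetical helper names, not code from the dissertation: it assumes thread-to-page sharing information is available and uses a simple greedy placement proportional to each node's CPU power.

```python
def group_threads(shared_pages):
    """shared_pages maps thread id -> set of shared data pages.
    Threads whose page sets overlap are merged into one group."""
    groups = []  # list of (thread_set, page_set)
    for tid, pages in shared_pages.items():
        tset, pset = {tid}, set(pages)
        kept = []
        for ts, ps in groups:
            if ps & pset:          # shares data: merge into one group
                tset |= ts
                pset |= ps
            else:
                kept.append((ts, ps))
        groups = kept + [(tset, pset)]
    return [sorted(ts) for ts, _ in groups]

def assign_groups(groups, cpu_power):
    """Greedy placement: each group goes, whole, to the node with the
    most spare capacity, where capacity is proportional to CPU power."""
    total_threads = sum(len(g) for g in groups)
    total_power = sum(cpu_power)
    load = [0.0] * len(cpu_power)
    placement = {}
    for g in sorted(groups, key=len, reverse=True):
        spare = [cpu_power[i] / total_power * total_threads - load[i]
                 for i in range(len(cpu_power))]
        node = spare.index(max(spare))
        for tid in g:
            placement[tid] = node
        load[node] += len(g)
    return placement

# Threads 0-1 share page "A", threads 2-3 share page "B"; the groups
# land on different nodes, but sharing threads stay colocated.
shared = {0: {"A"}, 1: {"A"}, 2: {"B"}, 3: {"B"}}
placement = assign_groups(group_threads(shared), cpu_power=[2, 1])
```

Because whole groups move together, redistribution never separates threads that share pages, which is exactly the communication cost the scheme is meant to avoid.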
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Cheng Kung University === Department of Electrical Engineering (Master's and Doctoral Program) === 92 ===   A multi-threaded distributed shared memory (DSM) system consists of a cluster of computers on a network and provides a parallel computing environment. In the past, most DSM systems divided the computation by distributing threads equally among the nodes. If the nodes differ in CPU performance, the node with the lowest processing power becomes the bottleneck of the system and dominates the execution time of the program. Previous dynamic load-balancing mechanisms employed thread migration to balance the system workload and reduce program execution time: at run time, they redistributed the working threads according to the performance of each node. However, thread migration has a drawback. Although it can effectively balance the system workload at run time, it may cause a large amount of inter-node communication if threads sharing the same data are redistributed to different nodes. The volume of network messages then increases substantially, with a severe impact on system performance. To solve this problem, we propose a group-based load-balancing scheme that takes data sharing into consideration to reduce the network communication resulting from thread migration.   In addition, using more nodes to execute a program does not necessarily yield better performance on DSM systems. Since DSM must propagate updates of program data among the nodes, the volume of network messages is proportional to the number of nodes in the system. If the number of nodes exceeds a threshold, the time spent on network communication outweighs the execution time saved by parallel computation, and program execution time no longer decreases as nodes are added. Consequently, finding the best system scale is critical to the performance of a DSM program. In this dissertation, we propose a model that characterizes program execution behavior on DSM and uses run-time information from the program and the DSM system to predict system performance. According to the predictions, the reconfiguration mechanism adapts the system configuration dynamically and redistributes the working threads with the group-based load-balancing scheme to obtain the best system performance. We implement and test the proposed mechanism on a DSM platform, Cohesion. The experimental results show that system reconfiguration with the group-based workload redistribution scheme is essential to improving the performance of a DSM program: it not only reduces program execution time but also increases the utilization of the system's computing resources.
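The scaling trade-off described above — computation time shrinking with more nodes while DSM update traffic grows roughly linearly with node count — can be illustrated with a toy cost model. This is an assumed model for illustration only, not the dissertation's actual prediction model; `w` and `c` are hypothetical parameters standing in for the run-time measurements the reconfiguration mechanism would collect.

```python
def predicted_time(w, c, n):
    """Toy DSM cost model: w is total computation time on one node,
    c is the per-node communication overhead added by propagating
    data updates.  Compute time shrinks as w/n, communication grows
    as c*n, so total time has a minimum at a finite node count."""
    return w / n + c * n

def best_scale(w, c, max_nodes):
    """Return the node count with the smallest predicted time."""
    return min(range(1, max_nodes + 1),
               key=lambda n: predicted_time(w, c, n))

# With w=100 and c=1, adding nodes helps only up to n=10; beyond
# that, communication overhead outweighs the parallel speedup.
n_best = best_scale(100.0, 1.0, 16)   # -> 10
```

A reconfiguration mechanism of the kind the abstract describes would recompute this prediction at run time and shrink or grow the set of active nodes toward the minimum.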
author2 Tyng-Yeu Liang
author_facet Tyng-Yeu Liang
Yi-Chang Zhuang
莊宜璋
author Yi-Chang Zhuang
莊宜璋
spellingShingle Yi-Chang Zhuang
莊宜璋
Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme
author_sort Yi-Chang Zhuang
title Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme
title_short Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme
title_full Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme
title_fullStr Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme
title_full_unstemmed Adapting Node Configuration on Distributed Shared Memory Systems with Group-Based Workload Redistribution Scheme
title_sort adapting node configuration on distributed shared memory systems with group-based workload redistribution scheme
publishDate 2004
url http://ndltd.ncl.edu.tw/handle/90527844641598442325
work_keys_str_mv AT yichangzhuang adaptingnodeconfigurationondistributedsharedmemorysystemswithgroupbasedworkloadredistributionscheme
AT zhuāngyízhāng adaptingnodeconfigurationondistributedsharedmemorysystemswithgroupbasedworkloadredistributionscheme
AT yichangzhuang fēnsànshìgòngyòngjìyìtǐxìtǒngcǎiqúnzǔgōngzuòfēnpèifāngshìzhīxìtǒngzhòngzǔjīzhì
AT zhuāngyízhāng fēnsànshìgòngyòngjìyìtǐxìtǒngcǎiqúnzǔgōngzuòfēnpèifāngshìzhīxìtǒngzhòngzǔjīzhì
_version_ 1718308575722340352