IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS

High Performance Computing (HPC) is the key to solving many scientific, financial, and engineering problems. Computer clusters are now the dominant architecture for HPC. The scale of clusters, both in terms of processors per node and the number of nodes, is increasing rapidly, reaching petascale today and soon approaching exascale.

Bibliographic Details
Main Author: RASHTI, Mohammad Javad
Other Authors: Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.))
Language: en
Published: 2010
Subjects: High Performance Computing; Message Passing; Computer Clusters; Interconnection Networks
Online Access:http://hdl.handle.net/1974/6284
id ndltd-LACETR-oai-collectionscanada.gc.ca-OKQ.1974-6284
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-OKQ.1974-6284 (2013-12-20T03:40:01Z)
IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
RASHTI, Mohammad Javad
High Performance Computing; Message Passing; Computer Clusters; Interconnection Networks
Thesis (Ph.D., Electrical & Computer Engineering) -- Queen's University, 2010-10-16 12:25:18.388
Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.))
2011-01-26T16:44:56Z
Thesis
http://hdl.handle.net/1974/6284
en
Canadian theses
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
collection NDLTD
language en
sources NDLTD
topic High Performance Computing
Message Passing
Computer Clusters
Interconnection Networks
spellingShingle High Performance Computing
Message Passing
Computer Clusters
Interconnection Networks
RASHTI, Mohammad Javad
IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
description High Performance Computing (HPC) is the key to solving many scientific, financial, and engineering problems. Computer clusters are now the dominant architecture for HPC. The scale of clusters, both in terms of processors per node and the number of nodes, is increasing rapidly, reaching petascale today and soon approaching exascale. Inter-process communication plays a significant role in the overall performance of HPC applications. With continuous enhancements in interconnection technologies and node architectures, the Message Passing Interface (MPI) needs to be improved to effectively utilize these modern technologies for higher performance. After providing background, I present an in-depth analysis of user-level and MPI libraries over modern cluster interconnects: InfiniBand, iWARP Ethernet, and Myrinet. Using novel techniques, I assess characteristics such as overlap and communication-progress ability, the effect of buffer reuse on latency, and multiple-connection scalability. The outcome highlights some of the inefficiencies in existing communication libraries. To improve communication progress and overlap in large-message transfers, I propose a method that uses speculative communication to overlap communication with computation in the MPI Rendezvous protocol. The results show up to 100% communication progress and more than 80% overlap ability over iWARP Ethernet. An adaptation mechanism avoids overhead for applications that do not benefit from the method due to their timing characteristics. To reduce MPI communication latency, I propose a technique that exploits application buffer-reuse characteristics for small messages and eliminates the sender-side copy in both the two-sided and one-sided MPI small-message transfer protocols. The implementation over InfiniBand improves small-message latency by up to 20%, and it adaptively falls back to the current method if the application does not benefit from the proposed technique. Finally, to improve the scalability of MPI applications on ultra-scale clusters, I propose an extension to the current iWARP standard that equips Ethernet with an efficient zero-copy, connectionless datagram transport, improving performance and memory usage on large-scale clusters. A software-level evaluation shows more than 40% performance improvement and 30% memory-usage reduction for MPI applications on a 64-core cluster. === Thesis (Ph.D., Electrical & Computer Engineering) -- Queen's University, 2010-10-16 12:25:18.388
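The description above explains the overlap/progress assessment and the Rendezvous-overlap idea only at a high level. As an illustration, the following is a minimal C/MPI sketch of an overlap micro-benchmark in that spirit; it is not code from the thesis, and the message size, iteration count, and synthetic compute loop are arbitrary assumptions. Rank 0 times a large nonblocking send once with no intervening work and once with computation inserted before the wait; comparing the two times indicates how much of the transfer the MPI library can progress independently (with full overlap, the second time approaches the larger of the communication and computation times rather than their sum).

/*
 * Illustrative overlap/progress micro-benchmark (not from the thesis).
 * Run with exactly two ranks, e.g.: mpirun -np 2 ./overlap
 * MSG_SIZE and the compute loop are arbitrary, assumed values.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE   (4 * 1024 * 1024)  /* large message, typically Rendezvous */
#define ITERATIONS 100
#define WORK       1000000L           /* synthetic computation per iteration */

/* Burn roughly 'work' floating-point operations so the compiler
 * cannot eliminate the loop. */
static double compute(long work)
{
    volatile double x = 1.0;
    for (long i = 0; i < work; i++)
        x = x * 1.000001 + 0.000001;
    return x;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *buf = malloc(MSG_SIZE);
    MPI_Request req;
    double t_comm = 0.0, t_overlap = 0.0;

    for (int i = 0; i < ITERATIONS; i++) {
        if (rank == 0) {
            /* Pure communication time: post the send and wait immediately. */
            double t0 = MPI_Wtime();
            MPI_Isend(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            t_comm += MPI_Wtime() - t0;

            /* Same transfer, but with computation between post and wait.
             * If the library progresses the transfer on its own, this time
             * approaches max(comm, compute); otherwise it approaches their sum. */
            t0 = MPI_Wtime();
            MPI_Isend(buf, MSG_SIZE, MPI_CHAR, 1, 1, MPI_COMM_WORLD, &req);
            compute(WORK);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            t_overlap += MPI_Wtime() - t0;
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 1, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
    }

    if (rank == 0) {
        printf("avg send only         : %f s\n", t_comm / ITERATIONS);
        printf("avg send with compute : %f s\n", t_overlap / ITERATIONS);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

A buffer-reuse latency test of the kind mentioned in the description could follow the same skeleton, replacing the compute loop with ping-pong exchanges that either reuse one buffer or cycle through several, to expose the sender-side copy cost for small messages.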
author2 Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.))
author_facet Queen's University (Kingston, Ont.). Theses (Queen's University (Kingston, Ont.))
RASHTI, Mohammad Javad
author RASHTI, Mohammad Javad
author_sort RASHTI, Mohammad Javad
title IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
title_short IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
title_full IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
title_fullStr IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
title_full_unstemmed IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS
title_sort improving message-passing performance and scalability in high-performance clusters
publishDate 2010
url http://hdl.handle.net/1974/6284
work_keys_str_mv AT rashtimohammadjavad improvingmessagepassingperformanceandscalabilityinhighperformanceclusters
_version_ 1716621241426640896