A network processor based message manager for MPI

Bibliographic Details
Main Author: Keppitiyagama, Chamath Indika
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/10681
Description
Summary: We have implemented a system called MPI-NP II, an MPI-specific messaging system for the Myrinet System Area Network (SAN). It consists of a low-level message manager executing on the LANai processor of the Myrinet Network Interface Card (NIC), a thin host interface layer, and LAM-MPI, a public domain version of MPI. MPI-NP II is a redesign of MPI-NP that simplifies the original implementation and improves its performance. MPI-NP differs from other low-level messaging systems in that it off-loads some of the MPI-specific communication tasks onto the network processor. In particular, it manages MPI message envelopes and can progress messages asynchronously from the host. It realizes three of the goals stated in the MPI standard, namely zero-copy messaging, overlap of communication and computation, and off-loading of tasks to a communication co-processor. In addition, it greatly simplifies and reduces host/NIC interaction and makes it possible to support broadcasting on the NIC. The design of MPI-NP II introduces the concept of a microchannel, which is analogous to an independent thread on the NIC whose task is to deliver a specific message. The message manager allows multiple outstanding send/receive requests and guarantees message delivery based on the available envelope resources, independent of the message size. We achieve these design goals without unduly burdening the slow network processor. MPI-NP II has a minimum message latency of 22 microseconds and a maximum bandwidth of 92 MB/s. These values are comparable to those of other low-level messaging systems, with the added benefit of being able to overlap communication and computation.
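
To make the idea of managing MPI message envelopes more concrete, the following is a minimal sketch of generic MPI-style envelope matching, in which a receive is paired with an incoming message by source, tag, and communication context. It is not taken from MPI-NP II: the names `Envelope`, `EnvelopeMatcher`, `post_receive`, and `on_arrival`, the wildcard values, and the two-queue layout are all assumptions chosen for illustration, and the actual message manager runs on the LANai processor rather than in Python.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical wildcard values, playing the role of MPI_ANY_SOURCE / MPI_ANY_TAG.
ANY_SOURCE = -1
ANY_TAG = -1


@dataclass(frozen=True)
class Envelope:
    source: int   # rank of the sender
    tag: int      # user-chosen message tag
    context: int  # communicator context identifier


def matches(posted: Envelope, incoming: Envelope) -> bool:
    """A posted receive matches if contexts agree and source/tag agree or are wildcards."""
    return (posted.context == incoming.context
            and posted.source in (ANY_SOURCE, incoming.source)
            and posted.tag in (ANY_TAG, incoming.tag))


class EnvelopeMatcher:
    """Illustrative matcher with a posted-receive queue and an unexpected-message queue."""

    def __init__(self):
        self.posted = deque()      # receives waiting for a message
        self.unexpected = deque()  # messages that arrived before a matching receive

    def post_receive(self, env: Envelope):
        # Check earlier arrivals first, oldest first, to preserve ordering.
        for i, incoming in enumerate(self.unexpected):
            if matches(env, incoming):
                del self.unexpected[i]
                return incoming
        self.posted.append(env)
        return None

    def on_arrival(self, incoming: Envelope):
        # Check pending receives, oldest first.
        for i, env in enumerate(self.posted):
            if matches(env, incoming):
                del self.posted[i]
                return env
        self.unexpected.append(incoming)
        return None
```

In a design like the one summarized above, performing this matching on the NIC rather than on the host is what would let a message make progress while the host continues computing; the sketch shows only the matching logic, not the data transfer.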