Optimizing hardware granularity in parallel systems

In order for parallel architectures to be of significant use in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this "divide and conquer" strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost:performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is investigated assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multi-processor and a message-passing multi-computer. The former is represented by a simple shared-bus multi-processor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two-dimensional mesh-connected multi-computer. In this type of system all memory is considered private and data sharing is carried out using "messages" explicitly passed among the processing elements.

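As a rough illustration of the kind of model the abstract describes, the sketch below builds a toy execution-time function T(N) with a serial term, an ideally parallel term that shrinks as 1/N, and a communication-overhead term that grows with N, then scans N to find the node count that minimises T. It is not taken from the thesis: the function names, the particular overhead term, and all constants are assumptions chosen only to show the shape of the tradeoff.

```python
# Toy model (illustrative only, not the thesis's model):
#   T(N) = t_serial + t_parallel / N + t_comm_per_node * N
# Adding nodes reduces the parallel term but inflates the communication
# term, so T(N) has a minimum at some finite N.

def execution_time(n_nodes, serial=1.0, parallel_work=100.0, comm_per_node=0.05):
    """Return T(N) for a hypothetical algorithm on N nodes."""
    return serial + parallel_work / n_nodes + comm_per_node * n_nodes

def optimal_node_count(max_nodes=1024):
    """Scan N = 1..max_nodes and return the N that minimises T(N)."""
    return min(range(1, max_nodes + 1), key=execution_time)

if __name__ == "__main__":
    best_n = optimal_node_count()
    print(f"optimal N = {best_n}, T(N) = {execution_time(best_n):.3f}")
```

With these made-up constants the minimum falls where the savings from dividing the parallel work are balanced by the linearly growing communication cost; the thesis formalises this flavour of tradeoff, at fixed overall hardware cost, for a shared-bus multi-processor and a two-dimensional mesh-connected multi-computer.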

Bibliographic Details
Main Author: Kelly, Thomas
Published: University of Edinburgh 1995
Subjects:
004
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.653270
id ndltd-bl.uk-oai-ethos.bl.uk-653270
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-6532702016-06-21T03:22:27ZOptimizing hardware granularity in parallel systemsKelly, Thomas1995In order for parallel architectures to be of significant use in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this "divide and conquer" strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost:performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is investigated assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multi-processor and a message-passing multi-computer. The former is represented by a simple shared-bus multi-processor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two-dimensional mesh-connected multi-computer. In this type of system all memory is considered private and data sharing is carried out using "messages" explicitly passed among the processing elements.004University of Edinburghhttp://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.653270http://hdl.handle.net/1842/15141Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 004
spellingShingle 004
Kelly, Thomas
Optimizing hardware granularity in parallel systems
description In order for parallel architectures to be of significant use in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this "divide and conquer" strategy. Whether or not this is the case depends on the nature of the algorithm and on the cost:performance functions associated with the real computer hardware available at a given time. This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time. Specifically, the optimization is investigated assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multi-processor and a message-passing multi-computer. The former is represented by a simple shared-bus multi-processor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two-dimensional mesh-connected multi-computer. In this type of system all memory is considered private and data sharing is carried out using "messages" explicitly passed among the processing elements.
author Kelly, Thomas
author_facet Kelly, Thomas
author_sort Kelly, Thomas
title Optimizing hardware granularity in parallel systems
title_short Optimizing hardware granularity in parallel systems
title_full Optimizing hardware granularity in parallel systems
title_fullStr Optimizing hardware granularity in parallel systems
title_full_unstemmed Optimizing hardware granularity in parallel systems
title_sort optimizing hardware granularity in parallel systems
publisher University of Edinburgh
publishDate 1995
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.653270
work_keys_str_mv AT kellythomas optimizinghardwaregranularityinparallelsystems
_version_ 1718312572521807872