Optimizing hardware granularity in parallel systems
Main Author: Kelly, Thomas
Published: University of Edinburgh, 1995
Subjects: 004
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.653270
               http://hdl.handle.net/1842/15141
Description:
For parallel architectures to deliver significantly better performance than uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads of this "divide and conquer" strategy. Whether they do depends on the nature of the algorithm and on the cost:performance functions of the real computer hardware available at a given time. This thesis investigates the tradeoff between the grain of the hardware and the speed of the hardware, in an attempt to show how the optimal degree of hardware parallelism can be assessed. A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which an optimal value of N, corresponding to minimum execution time, can be obtained. Specifically, the optimization is investigated assuming a particular base architecture, an algorithm or class of algorithms, and an overall hardware cost. Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architecture: a shared-memory multi-processor and a message-passing multi-computer. The former is represented by a simple shared-bus multi-processor in which each processing element operates on data held in a global shared store. The latter is represented by a two-dimensional mesh-connected multi-computer, in which all memory is private and data sharing is carried out through "messages" passed explicitly among the processing elements.
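The abstract does not give the functional form of the execution-time model, so the following is only an illustrative sketch of the kind of tradeoff it describes: under a fixed overall hardware budget, more nodes means slower (cheaper) nodes but more parallelism, and the overhead term differs between a shared-bus machine and a 2-D mesh. All parameter names, cost and overhead functions below are assumptions made for illustration, not taken from the thesis.

```python
import math

def execution_time(n_nodes, work, comm_per_node, total_cost, arch="bus"):
    """Toy model of execution time T(N) under a fixed hardware budget.

    Illustrative assumptions (not the thesis's actual model):
      - per-node speed grows with the per-node budget: speed ~ sqrt(total_cost / N)
      - compute time is the evenly divided workload over node speed
      - overhead: shared-bus contention grows roughly linearly with N,
        while 2-D mesh message latency grows with the mesh diameter (~ sqrt(N))
    """
    speed = math.sqrt(total_cost / n_nodes)        # diminishing returns on faster nodes
    compute = work / (n_nodes * speed)             # ideal division of the workload
    if arch == "bus":
        overhead = comm_per_node * n_nodes         # bus contention
    else:  # "mesh"
        overhead = comm_per_node * math.sqrt(n_nodes)  # mesh diameter
    return compute + overhead

def optimal_nodes(work, comm_per_node, total_cost, arch="bus", max_nodes=1024):
    """Scan N = 1..max_nodes and return the N that minimises T(N)."""
    return min(range(1, max_nodes + 1),
               key=lambda n: execution_time(n, work, comm_per_node, total_cost, arch))

if __name__ == "__main__":
    for arch in ("bus", "mesh"):
        n_opt = optimal_nodes(work=1e6, comm_per_node=5.0, total_cost=100.0, arch=arch)
        t_opt = execution_time(n_opt, 1e6, 5.0, 100.0, arch)
        print(f"{arch}: optimal N = {n_opt}, T(N) = {t_opt:.1f}")
```

With these assumed overhead functions the shared-bus variant favours a smaller N than the mesh, since its contention term grows faster; the actual optimum in the thesis depends on the algorithm class and the cost:performance functions it models.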