Efficient reduction over threads

The increasing number of cores in both desktops and servers leads to a demand for efficient parallel algorithms. This project focuses on the fundamental collective operation reduce, which merges several arrays into one by applying a binary operation element-wise. Several reduce algorithms are evalua...
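
As a rough illustration of the reduce operation named in the abstract, the following Python sketch merges several equal-length arrays into one by applying a binary operation element-wise, combining the arrays pairwise in a fixed binary tree. It is a sequential sketch under stated assumptions, not the thesis's algorithm: the function names `combine` and `tree_reduce` are hypothetical, the pairing order is static rather than based on thread arrival, and no threads are involved.

```python
from operator import add
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def combine(a: Sequence[T], b: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Apply the binary operation element-wise to two equal-length arrays."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return [op(x, y) for x, y in zip(a, b)]


def tree_reduce(arrays: Sequence[Sequence[T]], op: Callable[[T, T], T]) -> List[T]:
    """Merge several arrays into one by pairwise combination (hypothetical helper).

    Neighbouring arrays are paired level by level, so the operation only
    needs to be associative; the left-to-right order of operands is kept.
    """
    if not arrays:
        raise ValueError("need at least one array")
    level = [list(a) for a in arrays]
    while len(level) > 1:
        next_level = []
        # Combine neighbours (0,1), (2,3), ...
        for i in range(0, len(level) - 1, 2):
            next_level.append(combine(level[i], level[i + 1], op))
        # An unpaired last array moves up to the next level unchanged.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level
    return level[0]


if __name__ == "__main__":
    parts = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(tree_reduce(parts, add))
```

With `n` input arrays the tree has about `log2(n)` levels, which is why pairwise schemes are a common baseline for parallel reductions: the combinations within one level are independent of each other.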


Bibliographic Details
Main Author:	Falkman, Patrik
Format:	Others
Language:	English
Published:	KTH, Teoretisk fysik 2011
Subjects:	Computer and Information Sciences; Data- och informationsvetenskap
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-49818

id	ndltd-UPSALLA1-oai-DiVA.org-kth-49818
record_format	oai_dc
spelling	ndltd-UPSALLA1-oai-DiVA.org-kth-49818 | 2018-01-13T05:15:38Z | Efficient reduction over threads | eng | Falkman, Patrik | KTH, Teoretisk fysik | 2011 | Computer and Information Sciences | Data- och informationsvetenskap | The increasing number of cores in both desktops and servers leads to a demand for efficient parallel algorithms. This project focuses on the fundamental collective operation reduce, which merges several arrays into one by applying a binary operation element wise. Several reduce algorithms are evaluated in terms of performance and scalability and a novel algorithm is introduced that takes advantage of shared memory and exploits load imbalance. To do so, the concept of dynamic pair generation is introduced which implies constructing a binary reduce tree dynamically based on the order of thread arrival, where pairs are formed in a lock-free manner. We conclude that the dynamic algorithm, given enough spread in the arriving times, can outperform the reference algorithms for some or all array sizes. | Student thesis | info:eu-repo/semantics/bachelorThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-49818 | Trita-FYS, 0280-316X ; 57 | application/pdf | info:eu-repo/semantics/openAccess
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Computer and Information Sciences; Data- och informationsvetenskap
spellingShingle	Computer and Information Sciences Data- och informationsvetenskap Falkman, Patrik Efficient reduction over threads
description	The increasing number of cores in both desktops and servers leads to a demand for efficient parallel algorithms. This project focuses on the fundamental collective operation reduce, which merges several arrays into one by applying a binary operation element-wise. Several reduce algorithms are evaluated in terms of performance and scalability, and a novel algorithm is introduced that takes advantage of shared memory and exploits load imbalance. To do so, the concept of dynamic pair generation is introduced: a binary reduce tree is constructed dynamically based on the order of thread arrival, with pairs formed in a lock-free manner. We conclude that the dynamic algorithm, given enough spread in the arrival times, can outperform the reference algorithms for some or all array sizes.
author	Falkman, Patrik
author_facet	Falkman, Patrik
author_sort	Falkman, Patrik
title	Efficient reduction over threads
title_short	Efficient reduction over threads
title_full	Efficient reduction over threads
title_fullStr	Efficient reduction over threads
title_full_unstemmed	Efficient reduction over threads
title_sort	efficient reduction over threads
publisher	KTH, Teoretisk fysik
publishDate	2011
url	http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-49818
work_keys_str_mv	AT falkmanpatrik efficientreductionoverthreads
_version_	1718608424526151680


Similar Items