Reducing Communication Overhead and Computation Costs in a Cloud Network by Early Combination of Partial Results
Master's === National Sun Yat-sen University === Graduate Institute of Computer Science and Engineering === 99 === This thesis describes a method of reducing communication overhead within the MapReduce infrastructure of a cloud computing environment. MapReduce is a framework for parallelizing the processing of massive data sets stored across a distributed computer netw...
Main Authors: | Jun-neng Huang, 黃俊能 |
---|---|
Other Authors: | Steve W. Haga |
Format: | Others |
Language: | en_US |
Published: | 2011 |
Online Access: | http://ndltd.ncl.edu.tw/handle/09176936188923723605 |
id |
ndltd-TW-099NSYS5392055 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-099NSYS5392055 2015-10-19T04:03:19Z http://ndltd.ncl.edu.tw/handle/09176936188923723605 Reducing Communication Overhead and Computation Costs in a Cloud Network by Early Combination of Partial Results 藉由早期部分結果結合降低通訊成本和運算代價 Jun-neng Huang 黃俊能 Master's National Sun Yat-sen University Graduate Institute of Computer Science and Engineering 99 Steve W. Haga 希家史提夫 2011 thesis 46 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Sun Yat-sen University === Graduate Institute of Computer Science and Engineering === 99 === This thesis describes a method of reducing communication overhead within the MapReduce infrastructure of a cloud computing environment. MapReduce is a framework for parallelizing the processing of massive data sets stored across a distributed computer network. One of the benefits of MapReduce is that the computation is usually performed on a computer (node) that holds the data file. Not only does this approach achieve parallelism, but it also benefits from a characteristic common to many applications: the answer derived from a computation is often smaller than the input file.
Our new method also benefits from this feature. We delay the transmission of individual answers out of a given node so that these answers can first be combined locally. This combination has two advantages. First, it further reduces the amount of data that must ultimately be transmitted. Second, it allows for additional computation across files (such as a merge sort).
There is a limit to the benefit of delaying transmission, however, because the reducer stage of MapReduce cannot begin its work until the nodes transmit their answers. We therefore consider a mechanism that lets the user adjust the amount of delay before data is transmitted out of each node.
|
author2 |
Steve W. Haga |
author_facet |
Steve W. Haga Jun-neng Huang 黃俊能 |
author |
Jun-neng Huang 黃俊能 |
author_sort |
Jun-neng Huang |
title |
Reducing Communication Overhead and Computation Costs in a Cloud Network by Early Combination of Partial Results |
publishDate |
2011 |
url |
http://ndltd.ncl.edu.tw/handle/09176936188923723605 |
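The idea in the description, holding per-file answers on a node so they can be combined before anything is transmitted, can be illustrated with a small word-count sketch. This is not the thesis's implementation: the `NodeCombiner` class, the `max_pending` parameter (standing in for the user-adjustable delay), and the `sent` list (standing in for network transmission) are all hypothetical names chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def map_word_count(text: str) -> Counter:
    """Map step for one file: count the words it contains."""
    return Counter(text.split())


class NodeCombiner:
    """Buffers per-file partial results on one node and merges them locally.

    `max_pending` is a hypothetical knob for the adjustable delay: the node
    transmits once that many file results have been combined.
    """

    def __init__(self, max_pending: int) -> None:
        if max_pending < 1:
            raise ValueError("max_pending must be at least 1")
        self.max_pending = max_pending
        self._combined: Counter = Counter()
        self._pending = 0
        self.sent: List[Dict[str, int]] = []  # stands in for network sends

    def add(self, partial: Counter) -> None:
        """Merge one file's result into the local buffer; send if it is full."""
        self._combined.update(partial)
        self._pending += 1
        if self._pending >= self.max_pending:
            self.flush()

    def flush(self) -> None:
        """Transmit whatever has been combined so far, if anything."""
        if self._pending:
            self.sent.append(dict(self._combined))
            self._combined = Counter()
            self._pending = 0


def run_node(files: Iterable[str], max_pending: int) -> List[Dict[str, int]]:
    """Process every file on one node and return the messages it would send."""
    node = NodeCombiner(max_pending)
    for text in files:
        node.add(map_word_count(text))
    node.flush()  # send any remainder at the end
    return node.sent


def reduce_counts(messages: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the counts received from the node(s)."""
    total: Counter = Counter()
    for message in messages:
        total.update(message)
    return dict(total)
```

With `max_pending=1` every file's answer is sent immediately. Larger values send fewer, more compact messages, but the reducer has to wait longer for its first input, which is the trade-off the description points out.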