Improvement of Computing Performance of Hadoop Map Reduce

Master's === Chien Hsin University of Science and Technology === Master's Program, Department of Information Management === 106 === According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to approximately 5,000 GB per person worldwide, or at least five 1 TB hard drives...

Bibliographic Details
Main Authors:	Chih-Hsiang Yeh, 葉志翔
Other Authors:	邱南星
Format:	Others
Language:	zh-TW
Published:	2018
Online Access:	http://ndltd.ncl.edu.tw/handle/berxkp

id	ndltd-TW-106CYU05399004
record_format	oai_dc
spelling	ndltd-TW-106CYU05399004 2019-11-21T05:33:59Z http://ndltd.ncl.edu.tw/handle/berxkp Improvement of Computing Performance of Hadoop Map Reduce 提升Hadoop Map Reduce運算效能之研究 Chih-Hsiang Yeh 葉志翔 Master's, Chien Hsin University of Science and Technology, Master's Program, Department of Information Management, 106. According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to approximately 5,000 GB per person worldwide, which would take at least five 1 TB hard drives per person to store. Turning this data into useful information requires data processing, and how to process such large volumes of data efficiently is the motivation for this study. To handle data efficiently, many large information companies such as Facebook, Yahoo and Amazon use Hadoop. Hadoop is an open-source big data processing system whose core technology is MapReduce; it can be deployed not only on commercial servers but also on ordinary consumer personal computers. Besides processing data in stand-alone mode, Hadoop can combine multiple machines into a Hadoop cluster to increase processing speed. Clustering does improve data processing speed; however, Hadoop's default parameter settings provide only basic system operation and are not tuned for different operating environments, so the system cannot reach its full performance. This research therefore improves Hadoop's data processing efficiency by adjusting its parameters. According to the results of this study, Hadoop's data processing performance improved by up to 2.36 times. 邱南星 2018 Degree thesis ; thesis 164 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === Chien Hsin University of Science and Technology === Master's Program, Department of Information Management === 106 === According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to approximately 5,000 GB per person worldwide, which would take at least five 1 TB hard drives per person to store. Turning this data into useful information requires data processing, and how to process such large volumes of data efficiently is the motivation for this study. To handle data efficiently, many large information companies such as Facebook, Yahoo and Amazon use Hadoop. Hadoop is an open-source big data processing system whose core technology is MapReduce; it can be deployed not only on commercial servers but also on ordinary consumer personal computers. Besides processing data in stand-alone mode, Hadoop can combine multiple machines into a Hadoop cluster to increase processing speed. Clustering does improve data processing speed; however, Hadoop's default parameter settings provide only basic system operation and are not tuned for different operating environments, so the system cannot reach its full performance. This research therefore improves Hadoop's data processing efficiency by adjusting its parameters. According to the results of this study, Hadoop's data processing performance improved by up to 2.36 times.
author2	邱南星
author_facet	邱南星 Chih-Hsiang Yeh 葉志翔
author	Chih-Hsiang Yeh 葉志翔
spellingShingle	Chih-Hsiang Yeh 葉志翔 Improvement of Computing Performance of Hadoop Map Reduce
author_sort	Chih-Hsiang Yeh
title	Improvement of Computing Performance of Hadoop Map Reduce
title_short	Improvement of Computing Performance of Hadoop Map Reduce
title_full	Improvement of Computing Performance of Hadoop Map Reduce
title_fullStr	Improvement of Computing Performance of Hadoop Map Reduce
title_full_unstemmed	Improvement of Computing Performance of Hadoop Map Reduce
title_sort	improvement of computing performance of hadoop map reduce
publishDate	2018
url	http://ndltd.ncl.edu.tw/handle/berxkp
work_keys_str_mv	AT chihhsiangyeh improvementofcomputingperformanceofhadoopmapreduce AT yèzhìxiáng improvementofcomputingperformanceofhadoopmapreduce AT chihhsiangyeh tíshēnghadoopmapreduceyùnsuànxiàonéngzhīyánjiū AT yèzhìxiáng tíshēnghadoopmapreduceyùnsuànxiàonéngzhīyánjiū
_version_	1719293608488599552

Similar Items