Improvement of Computing Performance of Hadoop Map Reduce

Master's thesis === Chien Hsin University of Science and Technology (健行科技大學) === Master's Program, Department of Information Management === 106 === According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to roughly 5,000 GB per person worldwide, which would take at least five 1 TB hard drives...

Bibliographic Details
Main Authors: Chih-Hsiang Yeh, 葉志翔
Other Authors: 邱南星
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/berxkp
id ndltd-TW-106CYU05399004
record_format oai_dc
spelling ndltd-TW-106CYU05399004 2019-11-21T05:33:59Z http://ndltd.ncl.edu.tw/handle/berxkp Improvement of Computing Performance of Hadoop Map Reduce 提升Hadoop Map Reduce運算效能之研究 Chih-Hsiang Yeh 葉志翔 Master's thesis, Chien Hsin University of Science and Technology (健行科技大學), Master's Program, Department of Information Management, 106 邱南星 2018 thesis 164 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === Chien Hsin University of Science and Technology (健行科技大學) === Master's Program, Department of Information Management === 106 === According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to roughly 5,000 GB per person worldwide, which would take at least five 1 TB hard drives per person to store. Turning this data into useful information requires processing it, and how to process such large volumes of data efficiently is the motivation for this study. To handle data efficiently, many large technology companies such as Facebook, Yahoo, and Amazon use Hadoop. Hadoop is an open-source big-data processing system whose core technology is MapReduce; it can be deployed on commercial servers as well as on ordinary consumer-grade personal computers. Besides processing data on a single machine, Hadoop can combine multiple machines into a cluster to speed up data processing. Clustering does improve processing speed; however, Hadoop's default parameter settings only provide for basic operation of the system and are not tuned for different operating environments, so the system cannot reach its full performance. This research therefore improves Hadoop's data-processing efficiency by adjusting its parameters. According to the results of this study, parameter tuning increased Hadoop's data-processing performance by up to 2.36 times.
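The abstract does not say which parameters were adjusted, so the following is only a minimal sketch of what overriding Hadoop defaults can look like. The three property names are real MapReduce settings, but their selection and values here are assumptions for illustration, not the thesis's configuration; `build_site_xml` and `speedup` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Illustrative overrides only: the property names exist in Hadoop's
# mapred-default.xml, but these values are NOT taken from the thesis.
OVERRIDES = {
    "mapreduce.task.io.sort.mb": "256",       # map-side sort buffer size (MB)
    "mapreduce.job.reduces": "4",             # number of reduce tasks per job
    "mapreduce.map.output.compress": "true",  # compress intermediate map output
}


def build_site_xml(overrides):
    """Render a dict of overrides in the <configuration>/<property> layout
    used by Hadoop *-site.xml files (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def speedup(baseline_seconds, tuned_seconds):
    """Speedup factor: baseline runtime divided by tuned runtime."""
    return baseline_seconds / tuned_seconds


if __name__ == "__main__":
    print(build_site_xml(OVERRIDES))
```

Under this definition, a speedup of 2.36 would mean the tuned job finishes in about 42% of the baseline time; how the thesis measured its 2.36 figure is not stated in this record.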