Improvement of Computing Performance of Hadoop Map Reduce
Master's === Chien Hsin University of Science and Technology === Master's Program, Department of Information Management === 106 === According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to approximately 5,000 GB per person worldwide and requiring at least five 1 TB hard drives per person to store...
Main Authors: Chih-Hsiang Yeh 葉志翔
Other Authors: 邱南星
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/berxkp
id: ndltd-TW-106CYU05399004
record_format: oai_dc
spelling: ndltd-TW-106CYU053990042019-11-21T05:33:59Z http://ndltd.ncl.edu.tw/handle/berxkp Improvement of Computing Performance of Hadoop Map Reduce 提升Hadoop Map Reduce運算效能之研究 Chih-Hsiang Yeh 葉志翔 Master's, Chien Hsin University of Science and Technology, Master's Program, Department of Information Management, academic year 106. According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to approximately 5,000 GB per person worldwide and requiring at least five 1 TB hard drives per person to store. Turning this data into useful information requires processing it, and how to process such large volumes of data efficiently is the motivation of this study. Many large technology companies, such as Facebook, Yahoo, and Amazon, use Hadoop to process their data. Hadoop is an open-source big data processing system whose core technology is MapReduce; it can be deployed not only on commercial servers but also on ordinary consumer personal computers. Besides processing data on a single machine, Hadoop can combine multiple machines into a cluster to increase processing speed. However, Hadoop's default parameter settings only support basic operation and are not tuned for different operating environments, so the system cannot deliver its full performance. This study therefore improves the data processing efficiency of Hadoop by adjusting these parameters. According to the experimental results, the tuned configuration increased Hadoop's data processing performance by up to 2.36 times. 邱南星 2018 學位論文 ; thesis 164 zh-TW
collection: NDLTD
language: zh-TW
format: Others
sources: NDLTD
description: Master's === Chien Hsin University of Science and Technology === Master's Program, Department of Information Management === 106 === According to a 2012 survey by the International Data Corporation (IDC), the volume of digital data will grow to around 40 ZB by 2020, equivalent to approximately 5,000 GB per person worldwide and requiring at least five 1 TB hard drives per person to store. Turning this data into useful information requires processing it, and how to process such large volumes of data efficiently is the motivation of this study. Many large technology companies, such as Facebook, Yahoo, and Amazon, use Hadoop to process their data. Hadoop is an open-source big data processing system whose core technology is MapReduce; it can be deployed not only on commercial servers but also on ordinary consumer personal computers. Besides processing data on a single machine, Hadoop can combine multiple machines into a cluster to increase processing speed. However, Hadoop's default parameter settings only support basic operation and are not tuned for different operating environments, so the system cannot deliver its full performance. This study therefore improves the data processing efficiency of Hadoop by adjusting these parameters. According to the experimental results, the tuned configuration increased Hadoop's data processing performance by up to 2.36 times.
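The abstract attributes the reported gain (up to 2.36 times) to adjusting Hadoop's default MapReduce parameters, but this record does not list which parameters were changed or to what values. Purely as an illustrative sketch of this kind of tuning, the Java snippet below sets a few commonly adjusted MapReduce properties (map-side sort buffer size, map output compression, shuffle copier threads, reducer count) on a job configuration; the property names are standard Hadoop 2.x/3.x keys, while the specific values are assumptions, not the thesis's settings.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TunedJobConfig {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Illustrative values only -- the record does not state the thesis's actual settings.
        conf.setInt("mapreduce.task.io.sort.mb", 256);              // map-side sort buffer (default 100 MB)
        conf.setBoolean("mapreduce.map.output.compress", true);     // compress intermediate map output
        conf.setInt("mapreduce.reduce.shuffle.parallelcopies", 10); // parallel copiers in the shuffle phase

        Job job = Job.getInstance(conf, "tuned-mapreduce-job");
        job.setNumReduceTasks(4); // reducer count is another commonly tuned knob

        // Mapper, Reducer, and input/output paths would be set here as in any ordinary MapReduce job.
        System.out.println("io.sort.mb = "
                + job.getConfiguration().getInt("mapreduce.task.io.sort.mb", 100));
    }
}
```

The same properties can also be set cluster-wide in mapred-site.xml rather than per job; setting them per job, as above, makes it easier to compare a tuned run against the defaults on the same cluster.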
author2: 邱南星
author_facet: 邱南星 Chih-Hsiang Yeh 葉志翔
author: Chih-Hsiang Yeh 葉志翔
spellingShingle: Chih-Hsiang Yeh 葉志翔 Improvement of Computing Performance of Hadoop Map Reduce
author_sort: Chih-Hsiang Yeh
title: Improvement of Computing Performance of Hadoop Map Reduce
title_short: Improvement of Computing Performance of Hadoop Map Reduce
title_full: Improvement of Computing Performance of Hadoop Map Reduce
title_fullStr: Improvement of Computing Performance of Hadoop Map Reduce
title_full_unstemmed: Improvement of Computing Performance of Hadoop Map Reduce
title_sort: improvement of computing performance of hadoop map reduce
publishDate: 2018
url: http://ndltd.ncl.edu.tw/handle/berxkp
work_keys_str_mv: AT chihhsiangyeh improvementofcomputingperformanceofhadoopmapreduce AT yèzhìxiáng improvementofcomputingperformanceofhadoopmapreduce AT chihhsiangyeh tíshēnghadoopmapreduceyùnsuànxiàonéngzhīyánjiū AT yèzhìxiáng tíshēnghadoopmapreduceyùnsuànxiàonéngzhīyánjiū
_version_: 1719293608488599552