Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 98 === The term ‘cloud computing’ has grown in popularity in recent years. Many large companies, such as Yahoo and Google, have tried to provide related services to the business community and even to public users. In addition to the SQL technique, Map-Redu...
Main Authors: | Yun-sun Yee, 俞詠善 |
---|---|
Other Authors: | Jenq-Shiou Leu |
Format: | Others |
Language: | en_US |
Published: | 2010 |
Online Access: | http://ndltd.ncl.edu.tw/handle/4tmm55 |
id |
ndltd-TW-098NTUS5428073 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-098NTUS5428073 2019-05-15T20:32:55Z http://ndltd.ncl.edu.tw/handle/4tmm55 Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing Yun-sun Yee 俞詠善 Master's National Taiwan University of Science and Technology Department of Electronic Engineering 98 The term ‘cloud computing’ has grown in popularity in recent years. Many large companies, such as Yahoo and Google, have tried to provide related services to the business community and even to public users. In addition to SQL techniques, Map-Reduce, a programming model for large-scale data processing, has become a widely discussed topic in many studies. Many real-world tasks, such as data processing for search engines, can be implemented in parallel through a simple interface with two functions called Map and Reduce. In this paper, we focus on comparing the performance of the Hadoop implementation of Map-Reduce with SQL Server through simulations. In our study, Hadoop completes the same query faster than SQL Server. We also test several factors to see whether they affect Hadoop's performance. Finally, we find that adding more machines for data processing improves Hadoop's performance, especially for large-scale data sets. Jenq-Shiou Leu 呂政修 2010 Degree thesis ; thesis 45 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 98 === The term ‘cloud computing’ has grown in popularity in recent years. Many large companies, such as Yahoo and Google, have tried to provide related services to the business community and even to public users. In addition to SQL techniques, Map-Reduce, a programming model for large-scale data processing, has become a widely discussed topic in many studies. Many real-world tasks, such as data processing for search engines, can be implemented in parallel through a simple interface with two functions called Map and Reduce. In this paper, we focus on comparing the performance of the Hadoop implementation of Map-Reduce with SQL Server through simulations. In our study, Hadoop completes the same query faster than SQL Server. We also test several factors to see whether they affect Hadoop's performance. Finally, we find that adding more machines for data processing improves Hadoop's performance, especially for large-scale data sets. |
author2 |
Jenq-Shiou Leu |
author_facet |
Jenq-Shiou Leu Yun-sun Yee 俞詠善 |
author |
Yun-sun Yee 俞詠善 |
spellingShingle |
Yun-sun Yee 俞詠善 Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing |
author_sort |
Yun-sun Yee |
title |
Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing |
title_short |
Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing |
title_full |
Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing |
title_fullStr |
Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing |
title_full_unstemmed |
Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing |
title_sort |
performance study of map-reduce over hadoop of large-scale data processing |
publishDate |
2010 |
url |
http://ndltd.ncl.edu.tw/handle/4tmm55 |
work_keys_str_mv |
AT yunsunyee performancestudyofmapreduceoverhadoopoflargescaledataprocessing AT yúyǒngshàn performancestudyofmapreduceoverhadoopoflargescaledataprocessing |
_version_ |
1719099937776467968 |