Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing

Master's thesis === National Taiwan University of Science and Technology === Department of Electronic Engineering === 98 === The term ‘cloud computing’ has grown in popularity in recent years. Many major companies, such as Yahoo and Google, have tried to provide related services to the business community and even to public users. In addition to the SQL technique, Map-Redu...

Bibliographic Details
Main Authors: Yun-sun Yee, 俞詠善
Other Authors: Jenq-Shiou Leu
Format: Others
Language: en_US
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/4tmm55
id ndltd-TW-098NTUS5428073
record_format oai_dc
spelling ndltd-TW-098NTUS5428073 2019-05-15T20:32:55Z http://ndltd.ncl.edu.tw/handle/4tmm55 Yun-sun Yee 俞詠善 Jenq-Shiou Leu 呂政修 2010 thesis 45 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Taiwan University of Science and Technology === Department of Electronic Engineering === 98 === The term ‘cloud computing’ has grown in popularity in recent years. Many major companies, such as Yahoo and Google, have tried to provide related services to the business community and even to public users. Alongside SQL, Map-Reduce, a programming model for large-scale data processing, has become a widely studied topic. Many real-world tasks, such as data processing for search engines, can be parallelized through a simple interface with two functions called Map and Reduce. In this paper, we compare the performance of the Hadoop implementation of Map-Reduce with that of SQL Server through simulations. In our study, Hadoop completes the same query faster than SQL Server. We also test several factors to see whether they affect Hadoop's performance. Finally, we find that adding more machines for data processing improves Hadoop's performance, especially for large-scale data sets.
author2 Jenq-Shiou Leu
author_facet Jenq-Shiou Leu
Yun-sun Yee
俞詠善
author Yun-sun Yee
俞詠善
title Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
publishDate 2010
url http://ndltd.ncl.edu.tw/handle/4tmm55
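
The description above characterizes Map-Reduce as a simple interface with two functions called Map and Reduce. As an illustration only, the following is a minimal single-process sketch of that model in Python, using a word count as the task. It is not the thesis's code and does not use the Hadoop API; the names map_fn, reduce_fn, and run_map_reduce are hypothetical, and the grouping step that Hadoop would distribute across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(line: str) -> Iterator[Tuple[str, int]]:
    """Map step (hypothetical name): emit a (word, 1) pair per word."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step (hypothetical name): sum all counts for one word."""
    return word, sum(counts)


def run_map_reduce(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, group values by key, then reduce, all in one process.

    Assumption: a real framework would shuffle the intermediate pairs
    between machines; here a dictionary stands in for that step.
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_map_reduce(["a b a", "b c"]))
```

Because each call to map_fn depends only on its own input line, and each call to reduce_fn only on the values for one key, both steps could in principle run on separate machines, which is the property the description attributes to the model.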