Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing

Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 98 === The term ‘cloud computing’ has grown in popularity in recent years. Many major companies, such as Yahoo and Google, have tried to provide related services to the business community and even to the general public. In addition to the SQL technique, Map-Redu...

Bibliographic Details
Main Authors:	Yun-sun Yee, 俞詠善
Other Authors:	Jenq-Shiou Leu
Format:	Others
Language:	en_US
Published:	2010
Online Access:	http://ndltd.ncl.edu.tw/handle/4tmm55

id	ndltd-TW-098NTUS5428073
record_format	oai_dc
spelling	ndltd-TW-098NTUS5428073 2019-05-15T20:32:55Z http://ndltd.ncl.edu.tw/handle/4tmm55 Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing PerformanceStudyofMap-ReduceoverHadoopofLarge-scaleDataProcessing Yun-sun Yee 俞詠善 Master's National Taiwan University of Science and Technology Department of Electronic Engineering 98 The term ‘cloud computing’ has grown in popularity in recent years. Many major companies, such as Yahoo and Google, have tried to provide related services to the business community and even to the general public. In addition to the SQL technique, Map-Reduce, a programming model for large-scale data processing, has become a hot topic that is widely discussed in many studies. Many real-world tasks, such as data processing for search engines, can be implemented in parallel through a simple interface with two functions called Map and Reduce. In this paper, we focus on comparing the performance of the Hadoop implementation of Map-Reduce with that of SQL Server through simulations. In our studies, Hadoop completes the same query faster than SQL Server. In addition, we test several factors to see whether they affect Hadoop's performance. We also find that including more machines for data processing lets Hadoop achieve better performance, especially for a large-scale data set. Jenq-Shiou Leu 呂政修 2010 degree thesis ; thesis 45 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 98 === The term ‘cloud computing’ has grown in popularity in recent years. Many major companies, such as Yahoo and Google, have tried to provide related services to the business community and even to the general public. In addition to the SQL technique, Map-Reduce, a programming model for large-scale data processing, has become a hot topic that is widely discussed in many studies. Many real-world tasks, such as data processing for search engines, can be implemented in parallel through a simple interface with two functions called Map and Reduce. In this paper, we focus on comparing the performance of the Hadoop implementation of Map-Reduce with that of SQL Server through simulations. In our studies, Hadoop completes the same query faster than SQL Server. In addition, we test several factors to see whether they affect Hadoop's performance. We also find that including more machines for data processing lets Hadoop achieve better performance, especially for a large-scale data set.
author2	Jenq-Shiou Leu
author_facet	Jenq-Shiou Leu Yun-sun Yee 俞詠善
author	Yun-sun Yee 俞詠善
spellingShingle	Yun-sun Yee 俞詠善 Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
author_sort	Yun-sun Yee
title	Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
title_short	Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
title_full	Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
title_fullStr	Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
title_full_unstemmed	Performance Study of Map-Reduce over Hadoop of Large-scale Data Processing
title_sort	performance study of map-reduce over hadoop of large-scale data processing
publishDate	2010
url	http://ndltd.ncl.edu.tw/handle/4tmm55
work_keys_str_mv	AT yunsunyee performancestudyofmapreduceoverhadoopoflargescaledataprocessing AT yúyǒngshàn performancestudyofmapreduceoverhadoopoflargescaledataprocessing
_version_	1719099937776467968

Similar Items