A Scheduling for Multiple Small Files Processing in Inter-Cloud System

Master's === National Changhua University of Education === Department of Computer Science and Information Engineering === 103 === The era of big data has arrived. In just the last few years, we have produced more data than in all of prior human history. Obtaining information is easier than ever before. High-value data usually makes up less than 2% of a massive data set. Many profe...

Full description

Bibliographic Details
Main Authors: Chun-Han Tai, 戴君翰
Other Authors: Ming-Yi Shih
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/57732734363811863809
id ndltd-TW-103NCUE5392026
record_format oai_dc
spelling ndltd-TW-103NCUE5392026 2016-08-14T04:11:20Z http://ndltd.ncl.edu.tw/handle/57732734363811863809 A Scheduling for Multiple Small Files Processing in Inter-Cloud System 一個在互聯雲系統架構下處理多組小型檔案的調度策略 Chun-Han Tai 戴君翰 Master's National Changhua University of Education Department of Computer Science and Information Engineering 103 The era of big data has arrived. In just the last few years, we have produced more data than in all of prior human history. Obtaining information is easier than ever before. High-value data usually makes up less than 2% of a massive data set. Many professionals research how to process data sets with data mining algorithms on the Hadoop platform. However, as the number of resource consumers of cloud services increases significantly, it becomes apparent that capacity-oriented clouds need to come together. Inter-clouds, an architecture that combines multiple cloud service clusters, is a great solution to this problem. Remote cloud resources need to be dispatched when the computing capacity of the local cluster becomes saturated. The inter-cloud architecture was a revolution in data processing speed and solved the Hadoop NameNode fault problem. Much research on the effectiveness of the Hadoop platform is also being performed, on topics such as task scheduling, parameter optimization, and file system improvements. However, that research runs on a single cloud. In this paper, we present a coordinator that connects two Hadoop clusters and schedules jobs according to the differences in computing capacity and system resources between the two clusters. We experiment by processing a random data set, produced with a data generator, using the time-consuming FP-Growth algorithm, and evaluate the meta scheduler for inter-clouds. Ming-Yi Shih 施明毅 degree thesis ; thesis 50 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Changhua University of Education === Department of Computer Science and Information Engineering === 103 === The era of big data has arrived. In just the last few years, we have produced more data than in all of prior human history. Obtaining information is easier than ever before. High-value data usually makes up less than 2% of a massive data set. Many professionals research how to process data sets with data mining algorithms on the Hadoop platform. However, as the number of resource consumers of cloud services increases significantly, it becomes apparent that capacity-oriented clouds need to come together. Inter-clouds, an architecture that combines multiple cloud service clusters, is a great solution to this problem. Remote cloud resources need to be dispatched when the computing capacity of the local cluster becomes saturated. The inter-cloud architecture was a revolution in data processing speed and solved the Hadoop NameNode fault problem. Much research on the effectiveness of the Hadoop platform is also being performed, on topics such as task scheduling, parameter optimization, and file system improvements. However, that research runs on a single cloud. In this paper, we present a coordinator that connects two Hadoop clusters and schedules jobs according to the differences in computing capacity and system resources between the two clusters. We experiment by processing a random data set, produced with a data generator, using the time-consuming FP-Growth algorithm, and evaluate the meta scheduler for inter-clouds.
author2 Ming-Yi Shih
author_facet Ming-Yi Shih
Chun-Han Tai
戴君翰
author Chun-Han Tai
戴君翰
author_sort Chun-Han Tai
title A Scheduling for Multiple Small Files Processing in Inter-Cloud System
url http://ndltd.ncl.edu.tw/handle/57732734363811863809