Decentralized Scheduling for Many-Task Applications in the Hybrid Cloud

While Cloud Computing has transformed how we solve many computing tasks, some scientific and many-task applications are not efficiently executed on cloud resources. Decentralized scheduling, as studied in grid computing, can provide a scalable system to organize cloud resources and schedule a variet...

Bibliographic Details
Main Author: Peterson, Brian Lyle
Other Authors: Baumgartner, Gerald
Format: Others
Language: en
Published: LSU 2017
Subjects: Computer Science
Online Access: http://etd.lsu.edu/docs/available/etd-04082017-154817/
id ndltd-LSU-oai-etd.lsu.edu-etd-04082017-154817
record_format oai_dc
spelling ndltd-LSU-oai-etd.lsu.edu-etd-04082017-154817 2017-05-03T04:18:30Z Decentralized Scheduling for Many-Task Applications in the Hybrid Cloud Peterson, Brian Lyle Computer Science Baumgartner, Gerald; Ramanujam, J; Wang, Qingyang; Wang, Wanjun LSU 2017-05-02 text application/pdf http://etd.lsu.edu/docs/available/etd-04082017-154817/ en unrestricted I hereby certify that, if appropriate, I have obtained and attached herein a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to LSU or its agents the non-exclusive license to archive and make accessible, under the conditions specified below and in appropriate University policies, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.
collection NDLTD
language en
format Others
sources NDLTD
topic Computer Science
description While Cloud Computing has transformed how we solve many computing tasks, some scientific and many-task applications are not efficiently executed on cloud resources. Decentralized scheduling, as studied in grid computing, can provide a scalable system to organize cloud resources and schedule a variety of work. By measuring simulations of two algorithms, the fully decentralized Organic Grid and the partially decentralized Air Traffic Controller from IBM, we establish that decentralization is a workable approach and that there are bottlenecks that can impact partially centralized algorithms. Through measurements in the cloud, we verify that our simulation approach is sound and assess the variable performance of cloud resources. We propose a scheduler that measures the capabilities of the resources available to execute a task and distributes work dynamically at run time. Our scheduling algorithm is evaluated experimentally, and we show that performance-aware scheduling in a cloud environment can provide improvements in execution time. This provides a framework by which a variety of parameters can be weighed to make job-specific and context-aware scheduling decisions. Our measurements examine the usefulness of benchmarking as a metric to measure a node's performance and drive scheduling. Benchmarking provides an advantage over simple queue-based scheduling on distributed systems whose members vary in actual performance, but the NAS benchmark we use does not always correlate perfectly with actual performance. The utilized hardware is examined, as are enforced performance variations, and we observe changes in performance that result from running on a system in which different workers receive different CPU allocations. As we see that performance metrics are useful near the end of the execution of a large job, we create a new metric from historical data of partially completed work and use it to drive execution time down further. Interdependent task graph work is introduced and described as a next step in improving cloud scheduling. Realistic task graph problems are defined and a scheduling approach is introduced. This dissertation lays the groundwork to expand the types of problems that can be solved efficiently in the cloud environment.
author2 Baumgartner, Gerald
author Peterson, Brian Lyle
title Decentralized Scheduling for Many-Task Applications in the Hybrid Cloud
publisher LSU
publishDate 2017
url http://etd.lsu.edu/docs/available/etd-04082017-154817/