Dynamic monitoring, modeling and management of performance and resources for applications in cloud

Emerging trends in Cloud computing bring numerous benefits, such as higher performance, fast and flexible provisioning of applications and capacities, lower infrastructure costs, and almost unlimited scalability. However, the increasing complexity of automated performance and resource management for...

Bibliographic Details
Main Author: Xiong, Pengcheng
Published: Georgia Institute of Technology 2013
Subjects:
Online Access: http://hdl.handle.net/1853/45779
id ndltd-GATECH-oai-smartech.gatech.edu-1853-45779
record_format oai_dc
spelling ndltd-GATECH-oai-smartech.gatech.edu-1853-45779; 2013-05-30T03:05:55Z; Dynamic monitoring, modeling and management of performance and resources for applications in cloud; Xiong, Pengcheng; Performance management; Database management systems; Cloud computing; Resource management; Control theory; Machine learning; Electronic data processing -- Distributed processing; Resource allocation; Information technology; Information technology -- Management; Georgia Institute of Technology; 2013-01-17T21:00:28Z; 2013-01-17T21:00:28Z; 2012-11-06; Dissertation; http://hdl.handle.net/1853/45779
collection NDLTD
sources NDLTD
topic Performance management
Database management systems
Cloud computing
Resource management
Control theory
Machine learning
Electronic data processing -- Distributed processing
Resource allocation
Information technology
Information technology -- Management
description Emerging trends in Cloud computing bring numerous benefits, such as higher performance, fast and flexible provisioning of applications and capacities, lower infrastructure costs, and almost unlimited scalability. However, the increasing complexity of automated performance and resource management for applications in Cloud computing presents novel challenges that demand enhancements to classical control-based approaches. An important challenge that Cloud service providers often face is a resource sharing dilemma under workload variation. Cloud service providers pursue higher resource utilization because the higher the utilization, the lower the hardware, operating, and maintenance costs. On the other hand, resource utilization cannot be too high, or the service provider's revenue could be jeopardized by the inability to meet application-level service-level objectives (SLOs). A crucial research question is how to maximize profit for Cloud service providers: generating as much revenue as possible by satisfying service-level agreements while reducing costs as much as possible. To this end, classical control-based approaches, which can be classified into three major categories (admission control, queueing and scheduling, and resource allocation), show great potential to address the resource sharing dilemma. However, it is challenging to apply classical control-based approaches directly to computer systems, where first-principle models are generally not available. It becomes even more difficult because of the dynamics seen in real computer systems, including workload variations, multi-tier dependencies, and resource bottleneck shifts.
The main contributions of this thesis are enhancements to classical control-based approaches that leverage other techniques to address the increasing complexity of automated performance and resource management in the Cloud through dynamic monitoring, modeling and management of performance and resources. More specifically, (1) an admission control approach is enhanced by leveraging decision theory to achieve the most profitable service-level compliance; (2) a critical resource identification approach is enhanced by leveraging statistical machine learning to automatically and adaptively identify critical resources; and (3) a resource allocation approach is enhanced by leveraging hierarchical resource management to achieve the highest resource utilization. Concretely, the enhanced control-based approaches are implemented in a collection of real control systems: ActiveSLA, vPerfGuard and ERController. The control systems are applied to different real applications, such as OLTP and OLAP database applications and distributed multi-tier web applications, with different workload intensities, types and mixes, in different Cloud environments. All the experimental results show that the prototype control systems outperform existing classical control-based approaches. Finally, this thesis opens new avenues for addressing the increasing complexity of automated performance and resource management through enhancement of classical control-based approaches in Cloud environments. Future work will follow these avenues to address the new challenges that arise with the advent of new hardware technology, new software frameworks and new computing paradigms.
author Xiong, Pengcheng
title Dynamic monitoring, modeling and management of performance and resources for applications in cloud
publisher Georgia Institute of Technology
publishDate 2013
url http://hdl.handle.net/1853/45779