Summary: | Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 105 === Cloud web applications can scale their resources to match a dynamic workload using auto-scaling techniques. In this thesis, we focus on scaling containers of multiple sizes. Our goal is to minimize the container adjustment cost and resource insufficiency while maintaining high resource utilization. We first propose a dynamic programming scaling algorithm as a baseline. This algorithm scales the containers optimally when given the future workload. Then, we present two greedy scaling algorithms that work without future workload information. We also propose a heuristic scaling algorithm that predicts the future resource demand using Gradient Boosting Regression. This algorithm first predicts the workload for a short period ahead, then makes its scaling decisions using the optimal dynamic programming algorithm. We conduct experiments with two realistic workload traces and compare these algorithms under different parameter settings. Minimizing the container adjustment cost and the resource insufficiency at the same time is very challenging, so we discuss the trade-off between these two goals in various situations. The experiments show that when the cost of starting new servers matters much more than the resource insufficiency penalty, our short-term prediction approach increases the total cost by only 9.6% and decreases the utilization by only 10% compared with the dynamic programming algorithm that knows the future workload.
|
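The dynamic programming baseline mentioned in the summary can be illustrated with a minimal sketch. The summary does not give the thesis's cost model, so everything here is an assumption made for illustration: the function name `plan_scaling`, the container sizes, the per-container start cost, the linear shortage penalty, and the linear idle cost (standing in for the utilization goal) are all hypothetical. The sketch only shows the general idea of choosing, for a fully known demand trace, the sequence of container configurations with the lowest total cost.

```python
from itertools import product


def plan_scaling(demand, sizes=(1, 2, 4), max_count=2,
                 start_cost=3.0, shortage_penalty=5.0, idle_cost=1.0):
    """Return (total_cost, plan) for a known demand trace.

    All parameters and the cost model are illustrative assumptions,
    not the formulation used in the thesis.
    """
    if not demand:
        return 0.0, []

    # A configuration is a tuple of container counts, one entry per size.
    configs = list(product(range(max_count + 1), repeat=len(sizes)))

    def capacity(cfg):
        return sum(n * s for n, s in zip(cfg, sizes))

    def step_cost(cfg, d):
        # Penalize unmet demand and, more mildly, unused capacity.
        cap = capacity(cfg)
        return shortage_penalty * max(0, d - cap) + idle_cost * max(0, cap - d)

    def switch_cost(prev, cur):
        # Only newly started containers are charged; stopping is free (assumption).
        return start_cost * sum(max(0, c - p) for p, c in zip(prev, cur))

    empty = tuple(0 for _ in sizes)
    # best[cfg] = (cost of the cheapest plan that ends in cfg, that plan)
    best = {
        cfg: (switch_cost(empty, cfg) + step_cost(cfg, demand[0]), [cfg])
        for cfg in configs
    }
    for d in demand[1:]:
        nxt = {}
        for cur in configs:
            # Cheapest way to arrive at `cur` from any previous configuration.
            cost, path = min(
                ((best[prev][0] + switch_cost(prev, cur), best[prev][1])
                 for prev in configs),
                key=lambda item: item[0],
            )
            nxt[cur] = (cost + step_cost(cur, d), path + [cur])
        best = nxt
    return min(best.values(), key=lambda item: item[0])
```

For example, `plan_scaling([4, 4, 4])` starts one size-4 container and keeps it running. Under the same assumptions, a short-term prediction approach would call a routine like this on a forecast window instead of on the true future demand.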