ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
HBase is a distributed database management system and is becoming increasingly popular for applications that need fast random access to a large amount of data. However, it has a number of performance-critical configuration parameters, which may interact with each other in a complex way, making manual...
Main Authors: | Wen Xiong, Zhengdong Bei, Chengzhong Xu, Zhibin Yu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2017-01-01 |
Series: | IEEE Access |
Subjects: | HBase; auto tuning; performance modeling; performance optimization; ensemble learning |
Online Access: | https://ieeexplore.ieee.org/document/7950900/ |
id |
doaj-1e1ed55adf8543e988b84c4987b7d459 |
---|---|
record_format |
Article |
spelling |
doaj-1e1ed55adf8543e988b84c4987b7d459; 2021-03-29T20:15:31Z; eng; IEEE; IEEE Access; 2169-3536; 2017-01-01; vol. 5; pp. 13157-13170; 10.1109/ACCESS.2017.2716441; 7950900; ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning; Wen Xiong (0) https://orcid.org/0000-0003-1930-0049; Zhengdong Bei (1) https://orcid.org/0000-0001-6875-5539; Chengzhong Xu (2); Zhibin Yu (3); Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China; Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China; Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China; Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China; HBase is a distributed database management system and is becoming increasingly popular for applications that need fast random access to a large amount of data. However, it has a number of performance-critical configuration parameters, which may interact with each other in a complex way, making manually tuning them for optimal performance extremely difficult. In this paper, we propose a novel approach to auto-tune the configuration parameters for a given HBase application, called Auto-Tuning HBase (ATH). The key is an accurate performance model with low cost, which takes configuration parameters as inputs. To this end, we systematically explore different modeling techniques and decide to employ an ensemble learning algorithm to build the performance model. Subsequently, we leverage a genetic algorithm to search for the optimal configuration parameters for the application by using the performance model. As such, ATH can quickly as well as automatically identify a set of configuration parameter values to make the performance of the application optimal. We validate ATH in a cluster with ten nodes by using five typical applications from the Yahoo! Cloud Serving Benchmark. The experimental results show that ATH can improve throughput by 41% on average and up to 97% compared with the default configurations. At the same time, the latency of HBase operations is reduced by 11.3% on average and up to 57%.; https://ieeexplore.ieee.org/document/7950900/; HBase; auto tuning; performance modeling; performance optimization; ensemble learning |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wen Xiong, Zhengdong Bei, Chengzhong Xu, Zhibin Yu |
spellingShingle |
Wen Xiong; Zhengdong Bei; Chengzhong Xu; Zhibin Yu; ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning; IEEE Access; HBase; auto tuning; performance modeling; performance optimization; ensemble learning |
author_facet |
Wen Xiong, Zhengdong Bei, Chengzhong Xu, Zhibin Yu |
author_sort |
Wen Xiong |
title |
ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning |
title_sort |
ath: auto-tuning hbase’s configuration via ensemble learning |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2017-01-01 |
description |
HBase is a distributed database management system and is becoming increasingly popular for applications that need fast random access to a large amount of data. However, it has a number of performance-critical configuration parameters, which may interact with each other in a complex way, making manually tuning them for optimal performance extremely difficult. In this paper, we propose a novel approach to auto-tune the configuration parameters for a given HBase application, called Auto-Tuning HBase (ATH). The key is an accurate performance model with low cost, which takes configuration parameters as inputs. To this end, we systematically explore different modeling techniques and decide to employ an ensemble learning algorithm to build the performance model. Subsequently, we leverage a genetic algorithm to search for the optimal configuration parameters for the application by using the performance model. As such, ATH can quickly as well as automatically identify a set of configuration parameter values to make the performance of the application optimal. We validate ATH in a cluster with ten nodes by using five typical applications from the Yahoo! Cloud Serving Benchmark. The experimental results show that ATH can improve throughput by 41% on average and up to 97% compared with the default configurations. At the same time, the latency of HBase operations is reduced by 11.3% on average and up to 57%. |
topic |
HBase; auto tuning; performance modeling; performance optimization; ensemble learning |
url |
https://ieeexplore.ieee.org/document/7950900/ |
work_keys_str_mv |
AT wenxiong athautotuninghbasex2019sconfigurationviaensemblelearning AT zhengdongbei athautotuninghbasex2019sconfigurationviaensemblelearning AT chengzhongxu athautotuninghbasex2019sconfigurationviaensemblelearning AT zhibinyu athautotuninghbasex2019sconfigurationviaensemblelearning |
_version_ |
1724194959075049472 |
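
The abstract describes a two-stage pipeline: an ensemble-learned performance model that takes configuration parameters as inputs, followed by a genetic algorithm that searches that model for a good configuration. The Python sketch below illustrates the general idea only. This record does not say which ensemble algorithm or which parameters the paper uses, so a bagged k-nearest-neighbour regressor is swapped in as a stand-in. The two HBase property names are real, but their ranges, the `synthetic_throughput` function, and every helper name are assumptions made for illustration; no figure here comes from the paper.

```python
import random

# Assumed search space: (low, high) per parameter. Ranges are illustrative only.
SPACE = {
    "hbase.regionserver.handler.count": (10, 200),
    "hfile.block.cache.size": (0.1, 0.6),
}


def synthetic_throughput(cfg):
    """Invented stand-in for a real benchmark run (not measured data)."""
    h = cfg["hbase.regionserver.handler.count"]
    c = cfg["hfile.block.cache.size"]
    return 1000 - 0.05 * (h - 120) ** 2 - 4000 * (c - 0.4) ** 2


def random_config(rng):
    """Draw one configuration uniformly from the search space."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in SPACE.items()}


def normalise(cfg):
    """Scale every parameter to [0, 1] so distances are comparable."""
    return [(cfg[k] - lo) / (hi - lo) for k, (lo, hi) in SPACE.items()]


class BaggedKNN:
    """Bagging ensemble: each member is a k-NN regressor on a bootstrap sample."""

    def __init__(self, samples, n_models=10, k=3, rng=None):
        rng = rng or random.Random(0)
        self.k = k
        # One bootstrap resample (with replacement) per ensemble member.
        self.bags = [[rng.choice(samples) for _ in samples] for _ in range(n_models)]

    def _knn(self, bag, x):
        nearest = sorted(
            bag, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
        )[: self.k]
        return sum(y for _, y in nearest) / len(nearest)

    def predict(self, cfg):
        """Average the members' predictions for one configuration."""
        x = normalise(cfg)
        return sum(self._knn(bag, x) for bag in self.bags) / len(self.bags)


def genetic_search(model, rng, pop_size=30, generations=20, mutation_rate=0.2):
    """Search the surrogate model (not the real system) with a simple GA."""
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=model.predict, reverse=True)
        parents = pop[: pop_size // 2]  # keep the better-predicted half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each parameter comes from one of the two parents.
            child = {k: rng.choice((a[k], b[k])) for k in SPACE}
            for k, (lo, hi) in SPACE.items():
                if rng.random() < mutation_rate:
                    # Gaussian mutation, clipped back into the allowed range.
                    step = rng.gauss(0, 0.1 * (hi - lo))
                    child[k] = min(hi, max(lo, child[k] + step))
            children.append(child)
        pop = parents + children
    return max(pop, key=model.predict)


def tune(seed=0, n_train=150):
    """Collect training samples, fit the surrogate, then search it."""
    rng = random.Random(seed)
    train = []
    for _ in range(n_train):
        cfg = random_config(rng)
        train.append((normalise(cfg), synthetic_throughput(cfg)))
    model = BaggedKNN(train, rng=rng)
    return genetic_search(model, rng)


if __name__ == "__main__":
    best = tune()
    print(best, synthetic_throughput(best))
```

In a real setting, `synthetic_throughput` would be replaced by measurements from benchmark runs on the cluster. The point of the surrogate model is that the genetic algorithm can evaluate many candidate configurations without running the benchmark for each one.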