ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning

HBase is a distributed database management system and is becoming increasingly popular for applications that need fast random access to a large amount of data. However, it has a number of performance-critical configuration parameters, which may interact with each other in a complex way, making manual...

Bibliographic Details
Main Authors: Wen Xiong, Zhengdong Bei, Chengzhong Xu, Zhibin Yu
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects: HBase; auto tuning; performance modeling; performance optimization; ensemble learning
Online Access: https://ieeexplore.ieee.org/document/7950900/
id doaj-1e1ed55adf8543e988b84c4987b7d459
record_format Article
spelling doaj-1e1ed55adf8543e988b84c4987b7d459
2021-03-29T20:15:31Z
eng
IEEE
IEEE Access
2169-3536
2017-01-01
5
13157
13170
10.1109/ACCESS.2017.2716441
7950900
ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
Wen Xiong 0 https://orcid.org/0000-0003-1930-0049
Zhengdong Bei 1 https://orcid.org/0000-0001-6875-5539
Chengzhong Xu 2
Zhibin Yu 3
Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China
Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China
Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China
Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen, China
HBase is a distributed database management system and is becoming increasingly popular for applications that need fast random access to a large amount of data. However, it has a number of performance-critical configuration parameters, which may interact with each other in a complex way, making manually tuning them for optimal performance extremely difficult. In this paper, we propose a novel approach to auto-tune the configuration parameters for a given HBase application, called Auto-Tuning HBase (ATH). The key is an accurate performance model with low cost, which takes configuration parameters as inputs. To this end, we systematically explore different modeling techniques and decide to employ an ensemble learning algorithm to build the performance model. Subsequently, we leverage a genetic algorithm to search for the optimal configuration parameters for the application by using the performance model. As such, ATH can quickly and automatically identify a set of configuration parameter values to make the performance of the application optimal. We validate ATH in a cluster with ten nodes by using five typical applications from Yahoo! Cloud Serving Benchmark. The experimental results show that ATH can improve throughput by 41% on average and up to 97% compared with the default configurations. At the same time, the latency of HBase operations is reduced by 11.3% on average and up to 57%.
https://ieeexplore.ieee.org/document/7950900/
HBase
auto tuning
performance modeling
performance optimization
ensemble learning
collection DOAJ
language English
format Article
sources DOAJ
author Wen Xiong
Zhengdong Bei
Chengzhong Xu
Zhibin Yu
spellingShingle Wen Xiong
Zhengdong Bei
Chengzhong Xu
Zhibin Yu
ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
IEEE Access
HBase
auto tuning
performance modeling
performance optimization
ensemble learning
author_facet Wen Xiong
Zhengdong Bei
Chengzhong Xu
Zhibin Yu
author_sort Wen Xiong
title ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
title_short ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
title_full ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
title_fullStr ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
title_full_unstemmed ATH: Auto-Tuning HBase’s Configuration via Ensemble Learning
title_sort ath: auto-tuning hbase’s configuration via ensemble learning
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description HBase is a distributed database management system and is becoming increasingly popular for applications that need fast random access to a large amount of data. However, it has a number of performance-critical configuration parameters, which may interact with each other in a complex way, making manually tuning them for optimal performance extremely difficult. In this paper, we propose a novel approach to auto-tune the configuration parameters for a given HBase application, called Auto-Tuning HBase (ATH). The key is an accurate performance model with low cost, which takes configuration parameters as inputs. To this end, we systematically explore different modeling techniques and decide to employ an ensemble learning algorithm to build the performance model. Subsequently, we leverage a genetic algorithm to search for the optimal configuration parameters for the application by using the performance model. As such, ATH can quickly and automatically identify a set of configuration parameter values to make the performance of the application optimal. We validate ATH in a cluster with ten nodes by using five typical applications from Yahoo! Cloud Serving Benchmark. The experimental results show that ATH can improve throughput by 41% on average and up to 97% compared with the default configurations. At the same time, the latency of HBase operations is reduced by 11.3% on average and up to 57%.
topic HBase
auto tuning
performance modeling
performance optimization
ensemble learning
url https://ieeexplore.ieee.org/document/7950900/
work_keys_str_mv AT wenxiong athautotuninghbasex2019sconfigurationviaensemblelearning
AT zhengdongbei athautotuninghbasex2019sconfigurationviaensemblelearning
AT chengzhongxu athautotuninghbasex2019sconfigurationviaensemblelearning
AT zhibinyu athautotuninghbasex2019sconfigurationviaensemblelearning
_version_ 1724194959075049472
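
Illustrative sketch (not part of the catalog record). The description above outlines a two-stage approach: an ensemble-learning performance model that takes configuration parameters as inputs, followed by a genetic algorithm that searches that model for a good configuration. The Python below is a minimal sketch of that general idea under loud assumptions, not the authors' implementation. The record does not say which ensemble algorithm or which parameters ATH uses, so a toy bagged k-nearest-neighbour regressor is swapped in, the three parameters and their ranges are chosen only for illustration, and `fake_throughput` is a synthetic stand-in for actually running a benchmark.

```python
import random

# Assumed search space: (parameter name, low, high). The ranges are illustrative only.
SPACE = [
    ("hbase.regionserver.handler.count", 10, 200),
    ("hfile.block.cache.size", 0.1, 0.6),
    ("hbase.hregion.memstore.flush.size", 32, 512),  # expressed in MB for readability
]

def random_config(rng):
    """Draw one configuration uniformly from the assumed search space."""
    return [rng.uniform(lo, hi) for _, lo, hi in SPACE]

def normalize(cfg):
    """Scale every parameter to [0, 1] so distances are comparable."""
    return [(v - lo) / (hi - lo) for v, (_, lo, hi) in zip(cfg, SPACE)]

def fake_throughput(cfg):
    """Synthetic stand-in for a benchmark run; it does NOT model real HBase."""
    h, c, m = normalize(cfg)
    return 1000 * (1 - (h - 0.6) ** 2 - (c - 0.4) ** 2 - 0.5 * (m - 0.7) ** 2 + 0.3 * h * c)

class BaggedKnn:
    """Toy ensemble: each member is a k-NN regressor over a bootstrap sample."""

    def __init__(self, samples, members=10, k=3, rng=None):
        rng = rng or random.Random(0)
        self.k = k
        # Each member sees a resample (with replacement) of the training data.
        self.members = [[rng.choice(samples) for _ in samples] for _ in range(members)]

    def _knn(self, member, x):
        nearest = sorted(
            member, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
        )[: self.k]
        return sum(y for _, y in nearest) / len(nearest)

    def predict(self, cfg):
        """Average the members' predictions for one raw configuration."""
        x = normalize(cfg)
        return sum(self._knn(m, x) for m in self.members) / len(self.members)

def genetic_search(model, rng, pop_size=20, generations=25, mutation_rate=0.2):
    """Maximize model.predict with elitism, uniform crossover, Gaussian mutation."""
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=model.predict, reverse=True)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            for i, (_, lo, hi) in enumerate(SPACE):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))  # clamp to range
            children.append(child)
        pop = elite + children
    return max(pop, key=model.predict)

def tune(seed=0, n_samples=120):
    """Sample configurations, fit the ensemble, then search it with the GA."""
    rng = random.Random(seed)
    train = []
    for _ in range(n_samples):
        cfg = random_config(rng)
        train.append((normalize(cfg), fake_throughput(cfg)))
    model = BaggedKnn(train, rng=rng)
    best = genetic_search(model, rng)
    return dict(zip((name for name, _, _ in SPACE), best)), model.predict(best)

if __name__ == "__main__":
    config, predicted = tune()
    print(config, predicted)
```

In this sketch the expensive step (measuring throughput) happens only while collecting training samples; the genetic algorithm then evaluates candidates against the cheap model. A real tuner would presumably confirm the chosen configuration with an actual benchmark run.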