CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes
Many distributed database systems that guarantee high concurrency and scalability adopt a read-write separation architecture. At the same time, these systems must store massive amounts of data daily, which requires different mechanisms for storing and accessing data, such as hot and cold data access strat...
Main Authors: | Chuqiao Xiao, Yefeng Xia, Qian Zhang, Xueqing Gong, Liyan Zhu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-01-01 |
Series: | Electronics |
Subjects: | erasure codes; distributed database system; hot and cold separation; storage efficiency |
Online Access: | https://www.mdpi.com/2079-9292/10/2/126 |
id |
doaj-410307cc14af43fcb284d33ece494ecf |
---|---|
record_format |
Article |
spelling |
doaj-410307cc14af43fcb284d33ece494ecf; 2021-01-09T00:05:24Z; eng; MDPI AG; Electronics; 2079-9292; 2021-01-01; volume 10, article 126; doi:10.3390/electronics10020126; CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes; Chuqiao Xiao, Yefeng Xia, Qian Zhang, Xueqing Gong, Liyan Zhu (all: Software Engineering Institute, East China Normal University, Shanghai 200062, China); https://www.mdpi.com/2079-9292/10/2/126; erasure codes; distributed database system; hot and cold separation; storage efficiency |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chuqiao Xiao, Yefeng Xia, Qian Zhang, Xueqing Gong, Liyan Zhu |
author_sort |
Chuqiao Xiao |
title |
CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes |
publisher |
MDPI AG |
series |
Electronics |
issn |
2079-9292 |
publishDate |
2021-01-01 |
description |
Many distributed database systems that guarantee high concurrency and scalability adopt a read-write separation architecture. At the same time, these systems must store massive amounts of data daily, which requires different mechanisms for storing and accessing data, such as hot and cold data access strategies. Unlike a distributed storage system, a distributed database splits a table into sub-tables, or shards, and the request frequency of each sub-table differs within a given period. It is therefore necessary to design not only hot-to-cold approaches that reduce storage overhead but also cold-to-hot methods that preserve the high concurrency of those systems. We present a new redundancy strategy named CBase-EC, which uses erasure codes to trade off transaction processing performance against storage efficiency in CBase database systems developed for financial scenarios of the Bank. Two algorithms are proposed: the hot-cold tablet (shard) recognition algorithm and the hot-cold dynamic conversion algorithm. We then adopt two optimization approaches to improve CBase-EC performance. In the experiments, we compare CBase-EC with the three-replica scheme in CBase. The experimental results show that although transaction processing performance declined by no more than 6%, storage efficiency increased by 18.4%. |
topic |
erasure codes; distributed database system; hot and cold separation; storage efficiency |
url |
https://www.mdpi.com/2079-9292/10/2/126 |