CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes

Many distributed database systems that guarantee high concurrency and scalability adopt read-write separation architecture. Simultaneously, these systems need to store massive amounts of data daily, requiring different mechanisms for storing and accessing data, such as hot and cold data access strat...

Bibliographic Details
Main Authors: Chuqiao Xiao, Yefeng Xia, Qian Zhang, Xueqing Gong, Liyan Zhu
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Electronics
Subjects: erasure codes; distributed database system; hot and cold separation; storage efficiency
Online Access: https://www.mdpi.com/2079-9292/10/2/126
id doaj-410307cc14af43fcb284d33ece494ecf
record_format Article
spelling doaj-410307cc14af43fcb284d33ece494ecf
2021-01-09T00:05:24Z
eng
MDPI AG
Electronics
2079-9292
2021-01-01
10
126
126
10.3390/electronics10020126
CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes
Chuqiao Xiao, Yefeng Xia, Qian Zhang, Xueqing Gong, Liyan Zhu
Software Engineering Institute, East China Normal University, Shanghai 200062, China (affiliation of all five authors)
collection DOAJ
language English
format Article
sources DOAJ
author Chuqiao Xiao
Yefeng Xia
Qian Zhang
Xueqing Gong
Liyan Zhu
title CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes
publisher MDPI AG
series Electronics
issn 2079-9292
publishDate 2021-01-01
description Many distributed database systems that guarantee high concurrency and scalability adopt a read-write separation architecture. At the same time, these systems must store massive amounts of data daily, which requires different mechanisms for storing and accessing data, such as hot and cold data access strategies. Unlike a distributed storage system, a distributed database splits a table into sub-tables or shards, and the request frequency of each sub-table differs within a given period. It is therefore necessary to design not only hot-to-cold approaches that reduce storage overhead, but also cold-to-hot methods that preserve the high concurrency of those systems. We present a new redundancy strategy named CBase-EC, which uses erasure codes to trade off transaction processing performance against storage efficiency for the CBase database system, developed for financial scenarios of the Bank. Two algorithms are proposed: the hot-cold tablet (shard) recognition algorithm and the hot-cold dynamic conversion algorithm. We then adopt two optimization approaches to improve CBase-EC performance. In the experiments, we compare CBase-EC with the three-replica scheme in CBase. The experimental results show that although transaction processing performance declined by no more than 6%, storage efficiency increased by 18.4%.
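The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the request-rate threshold, the `Tablet` fields, and the erasure-code parameters k = 4 data blocks and m = 2 parity blocks are all assumptions chosen for illustration, and the sketch does not attempt to reproduce the reported 18.4% figure. It only shows the general arithmetic: a tablet kept as three replicas occupies 3 times its size, while a tablet stored with a (k, m) erasure code occupies (k + m) / k times its size.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    name: str
    size_gb: float
    requests_per_hour: int  # hypothetical access statistic


def is_hot(tablet: Tablet, threshold: int) -> bool:
    # Assumed rule: a tablet is hot when its request rate reaches the threshold.
    return tablet.requests_per_hour >= threshold


def stored_gb(tablets, threshold, k, m, replicas=3):
    """Total physical storage when hot tablets are replicated and
    cold tablets are erasure coded with k data and m parity blocks."""
    total = 0.0
    for t in tablets:
        if is_hot(t, threshold):
            total += t.size_gb * replicas      # full copies
        else:
            total += t.size_gb * (k + m) / k   # data plus parity
    return total


tablets = [
    Tablet("t1", 10.0, 5000),  # frequently requested
    Tablet("t2", 10.0, 12),
    Tablet("t3", 20.0, 3),
]

mixed = stored_gb(tablets, threshold=100, k=4, m=2)
all_replicated = sum(t.size_gb for t in tablets) * 3
print(mixed, all_replicated)
```

With these made-up numbers, only t1 stays replicated, so the mixed layout needs less physical storage than replicating every tablet three times. Reading a cold tablet may require decoding, which is the general reason such a scheme costs some transaction processing performance.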
topic erasure codes
distributed database system
hot and cold separation
storage efficiency
url https://www.mdpi.com/2079-9292/10/2/126