MHDFS: A Memory-Based Hadoop Framework for Large Data Storage
The Hadoop distributed file system (HDFS) is undoubtedly the most popular framework for storing and processing large amounts of data on clusters of machines. Although a plethora of practices have been proposed for improving the processing efficiency and resource utilization, traditional HDFS still suffer...
Main Authors: | Aibo Song, Maoxian Zhao, Yingying Xue, Junzhou Luo |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2016-01-01 |
Series: | Scientific Programming |
Online Access: | http://dx.doi.org/10.1155/2016/1808396 |
id | doaj-8cc2578b101b4ad1ba7b14a16d6bc91e |
---|---|
record_format | Article |
spelling | doaj-8cc2578b101b4ad1ba7b14a16d6bc91e; 2021-07-02T01:38:19Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 2016-01-01; 2016; 10.1155/2016/1808396; 1808396; MHDFS: A Memory-Based Hadoop Framework for Large Data Storage; Aibo Song [0]; Maoxian Zhao [1]; Yingying Xue [2]; Junzhou Luo [3]; [0] School of Computer Science and Engineering, Southeast University, Nanjing 211189, China; [1] College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590, China; [2] School of Computer Science and Engineering, Southeast University, Nanjing 211189, China; [3] School of Computer Science and Engineering, Southeast University, Nanjing 211189, China; http://dx.doi.org/10.1155/2016/1808396 |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Aibo Song; Maoxian Zhao; Yingying Xue; Junzhou Luo |
spellingShingle | Aibo Song; Maoxian Zhao; Yingying Xue; Junzhou Luo; MHDFS: A Memory-Based Hadoop Framework for Large Data Storage; Scientific Programming |
author_facet | Aibo Song; Maoxian Zhao; Yingying Xue; Junzhou Luo |
author_sort | Aibo Song |
title | MHDFS: A Memory-Based Hadoop Framework for Large Data Storage |
title_short | MHDFS: A Memory-Based Hadoop Framework for Large Data Storage |
title_full | MHDFS: A Memory-Based Hadoop Framework for Large Data Storage |
title_fullStr | MHDFS: A Memory-Based Hadoop Framework for Large Data Storage |
title_full_unstemmed | MHDFS: A Memory-Based Hadoop Framework for Large Data Storage |
title_sort | mhdfs: a memory-based hadoop framework for large data storage |
publisher | Hindawi Limited |
series | Scientific Programming |
issn | 1058-9244; 1875-919X |
publishDate | 2016-01-01 |
description | The Hadoop distributed file system (HDFS) is undoubtedly the most popular framework for storing and processing large amounts of data on clusters of machines. Although a plethora of practices have been proposed for improving processing efficiency and resource utilization, traditional HDFS still suffers from the low throughput and I/O rate of disk-based storage. In this paper, we attempt to address this problem by developing a memory-based Hadoop framework called MHDFS. First, a strategy for allocating and configuring reasonable memory resources for MHDFS is designed, and RAMFS is utilized to develop the framework. Then, we propose a new method for replacing data to disk when memory resources are excessively occupied. An algorithm for estimating and updating the replacement is designed based on file-heat metrics. Finally, substantial experiments are conducted which demonstrate the effectiveness of MHDFS and its advantage over conventional HDFS. |
url | http://dx.doi.org/10.1155/2016/1808396 |
work_keys_str_mv | AT aibosong mhdfsamemorybasedhadoopframeworkforlargedatastorage AT maoxianzhao mhdfsamemorybasedhadoopframeworkforlargedatastorage AT yingyingxue mhdfsamemorybasedhadoopframeworkforlargedatastorage AT junzhouluo mhdfsamemorybasedhadoopframeworkforlargedatastorage |
_version_ | 1721344640599195648 |