MHDFS: A Memory-Based Hadoop Framework for Large Data Storage

Hadoop distributed file system (HDFS) is undoubtedly the most popular framework for storing and processing large amounts of data on clusters of machines. Although a plethora of practices have been proposed for improving processing efficiency and resource utilization, traditional HDFS still suffer...

Bibliographic Details
Main Authors: Aibo Song, Maoxian Zhao, Yingying Xue, Junzhou Luo
Format: Article
Language: English
Published: Hindawi Limited 2016-01-01
Series: Scientific Programming
Online Access: http://dx.doi.org/10.1155/2016/1808396
id doaj-8cc2578b101b4ad1ba7b14a16d6bc91e
record_format Article
spelling doaj-8cc2578b101b4ad1ba7b14a16d6bc91e
2021-07-02T01:38:19Z
eng
Hindawi Limited
Scientific Programming
1058-9244
1875-919X
2016-01-01
2016
10.1155/2016/1808396
1808396
MHDFS: A Memory-Based Hadoop Framework for Large Data Storage
Aibo Song (School of Computer Science and Engineering, Southeast University, Nanjing 211189, China)
Maoxian Zhao (College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590, China)
Yingying Xue (School of Computer Science and Engineering, Southeast University, Nanjing 211189, China)
Junzhou Luo (School of Computer Science and Engineering, Southeast University, Nanjing 211189, China)
Hadoop distributed file system (HDFS) is undoubtedly the most popular framework for storing and processing large amounts of data on clusters of machines. Although a plethora of practices have been proposed for improving processing efficiency and resource utilization, traditional HDFS still suffers from the low throughput and I/O rate of disk-based storage. In this paper, we attempt to address this problem by developing a memory-based Hadoop framework called MHDFS. First, a strategy for allocating and configuring reasonable memory resources for MHDFS is designed, and RAMFS is utilized to develop the framework. Then, we propose a new method for replacing data to disk when memory resources are excessively occupied. An algorithm for estimating and updating the replacement is designed based on file heat metrics. Finally, substantial experiments are conducted that demonstrate the effectiveness of MHDFS and its advantages over conventional HDFS.
http://dx.doi.org/10.1155/2016/1808396
collection DOAJ
language English
format Article
sources DOAJ
author Aibo Song
Maoxian Zhao
Yingying Xue
Junzhou Luo
spellingShingle Aibo Song
Maoxian Zhao
Yingying Xue
Junzhou Luo
MHDFS: A Memory-Based Hadoop Framework for Large Data Storage
Scientific Programming
author_facet Aibo Song
Maoxian Zhao
Yingying Xue
Junzhou Luo
author_sort Aibo Song
title MHDFS: A Memory-Based Hadoop Framework for Large Data Storage
publisher Hindawi Limited
series Scientific Programming
issn 1058-9244
1875-919X
publishDate 2016-01-01
description Hadoop distributed file system (HDFS) is undoubtedly the most popular framework for storing and processing large amounts of data on clusters of machines. Although a plethora of practices have been proposed for improving processing efficiency and resource utilization, traditional HDFS still suffers from the low throughput and I/O rate of disk-based storage. In this paper, we attempt to address this problem by developing a memory-based Hadoop framework called MHDFS. First, a strategy for allocating and configuring reasonable memory resources for MHDFS is designed, and RAMFS is utilized to develop the framework. Then, we propose a new method for replacing data to disk when memory resources are excessively occupied. An algorithm for estimating and updating the replacement is designed based on file heat metrics. Finally, substantial experiments are conducted that demonstrate the effectiveness of MHDFS and its advantages over conventional HDFS.
url http://dx.doi.org/10.1155/2016/1808396