Identifying the Data Discrepancy Existing in Hadoop Clusters
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 104 === In recent years, cloud computing has developed rapidly in the realm of the Internet. Among many cloud computing platforms, Hadoop is widely used because of its stability and performance. It can easily handle a large number of files in a very efficient way. Had...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: |
2016
|
Online Access: | http://ndltd.ncl.edu.tw/handle/21290145999940618636 |
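
The abstract describes identifying discrepancies between files duplicated on different Hadoop clusters, but this record does not say how the thesis does it. The following is only a minimal sketch of one possible approach, assuming each cluster can export a manifest that maps a file path to a checksum string (for example, collected from the output of `hdfs dfs -checksum`). The names `compare_manifests` and `Discrepancy`, and the manifest layout, are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Discrepancy:
    """Differences found between two cluster manifests (hypothetical type)."""
    only_in_a: list = field(default_factory=list)   # paths missing from cluster B
    only_in_b: list = field(default_factory=list)   # paths missing from cluster A
    mismatched: list = field(default_factory=list)  # paths whose checksums differ

    def is_clean(self) -> bool:
        """True when the two manifests agree on every path and checksum."""
        return not (self.only_in_a or self.only_in_b or self.mismatched)


def compare_manifests(manifest_a: dict, manifest_b: dict) -> Discrepancy:
    """Compare two {path: checksum} manifests, one per cluster.

    Assumption: both manifests use the same checksum algorithm and the
    same path layout, so equal strings mean equal file contents.
    """
    report = Discrepancy()
    # Walk the union of paths in sorted order so the report is deterministic.
    for path in sorted(manifest_a.keys() | manifest_b.keys()):
        if path not in manifest_b:
            report.only_in_a.append(path)
        elif path not in manifest_a:
            report.only_in_b.append(path)
        elif manifest_a[path] != manifest_b[path]:
            report.mismatched.append(path)
    return report


if __name__ == "__main__":
    # Made-up manifests for illustration only.
    cluster_a = {"/data/x": "c1", "/data/y": "c2", "/data/z": "c3"}
    cluster_b = {"/data/x": "c1", "/data/y": "XX", "/data/w": "c4"}
    print(compare_manifests(cluster_a, cluster_b))
```

Comparing checksums instead of file contents keeps the comparison cheap, since only the manifests need to be moved between clusters. It does rely on both clusters computing checksums the same way, which is an assumption of this sketch.
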
id |
ndltd-TW-104FJU00396006 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104FJU00396006 2017-04-29T04:31:40Z http://ndltd.ncl.edu.tw/handle/21290145999940618636 Identifying the Data Discrepancy Existing in Hadoop Clusters 實現雲端運算Hadoop叢集儲存資料之差異分析 YU TZU-TING 游資婷 Master's Fu Jen Catholic University Master's Program, Department of Computer Science and Information Engineering 104 In recent years, cloud computing has developed rapidly in the realm of the Internet. Among many cloud computing platforms, Hadoop is widely used because of its stability and performance. It can easily handle a large number of files in a very efficient way. Hadoop is a distributed system, and the Hadoop Distributed File System (HDFS) is the default file system used on the Hadoop platform. HDFS consists of a NameNode and multiple DataNodes. The NameNode records file metadata, including file location, file owner, and other related information. The DataNodes are where the files are actually stored. Each file is generally replicated on several DataNodes. However, file contents still cannot be retrieved if the NameNode is lost, or if all DataNodes storing those files are destroyed at the same time. To guard against this, important files can be backed up on multiple Hadoop clusters. Nevertheless, errors can occur during file duplication. We design and implement a scheme to identify the discrepancies between Hadoop clusters so that users can fix mismatches between files duplicated on different Hadoop clusters. 葉佐任 2016 thesis 46 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others
|
sources |
NDLTD |
description |
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 104 === In recent years, cloud computing has developed rapidly in the realm of the Internet. Among many cloud computing platforms, Hadoop is widely used because of its stability and performance. It can easily handle a large number of files in a very efficient way.
Hadoop is a distributed system, and the Hadoop Distributed File System (HDFS) is the default file system used on the Hadoop platform. HDFS consists of a NameNode and multiple DataNodes. The NameNode records file metadata, including file location, file owner, and other related information. The DataNodes are where the files are actually stored. Each file is generally replicated on several DataNodes. However, file contents still cannot be retrieved if the NameNode is lost, or if all DataNodes storing those files are destroyed at the same time. To guard against this, important files can be backed up on multiple Hadoop clusters. Nevertheless, errors can occur during file duplication.
We design and implement a scheme to identify the discrepancies between Hadoop clusters so that users can fix mismatches between files duplicated on different Hadoop clusters.
|
author2 |
葉佐任 |
author_facet |
葉佐任 YU TZU-TING 游資婷 |
author |
YU TZU-TING 游資婷 |
author_sort |
YU TZU-TING |
title |
Identifying the Data Discrepancy Existing in Hadoop Clusters |
publishDate |
2016 |
url |
http://ndltd.ncl.edu.tw/handle/21290145999940618636 |