Identifying the Data Discrepancy Existing in Hadoop Clusters

Master's thesis === Fu Jen Catholic University (輔仁大學) === Graduate Program, Department of Computer Science and Information Engineering (資訊工程學系碩士班) === 104 === In recent years, cloud computing has developed rapidly in the realm of the Internet. Among the many cloud computing platforms, Hadoop is widely used because of its stability and performance. It can easily handle a large number of files efficiently. Had...

Bibliographic Details
Main Authors: YU TZU-TING, 游資婷
Other Authors: 葉佐任
Format: Others
Language: zh-TW
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/21290145999940618636
id ndltd-TW-104FJU00396006
record_format oai_dc
spelling ndltd-TW-104FJU00396006 2017-04-29T04:31:40Z http://ndltd.ncl.edu.tw/handle/21290145999940618636 Identifying the Data Discrepancy Existing in Hadoop Clusters 實現雲端運算Hadoop叢集儲存資料之差異分析 YU TZU-TING 游資婷 Master's thesis, Fu Jen Catholic University (輔仁大學), Graduate Program, Department of Computer Science and Information Engineering (資訊工程學系碩士班), 104. In recent years, cloud computing has developed rapidly in the realm of the Internet. Among the many cloud computing platforms, Hadoop is widely used because of its stability and performance. It can easily handle a large number of files efficiently. Hadoop is a distributed system, and the Hadoop Distributed File System (HDFS) is its default file system. HDFS consists of a NameNode and multiple DataNodes. The NameNode records file metadata, including file location, file owner, and other related information. The DataNodes are where the file contents are actually stored. In general, each file is replicated on several DataNodes. However, file contents can still become unretrievable if the NameNode is lost, or if all DataNodes storing those files are destroyed at the same time. To address this problem, important files can be backed up on multiple Hadoop clusters. Nevertheless, errors can occur during file duplication. We design and implement a scheme that identifies discrepancies between Hadoop clusters so that users can fix mismatches between files duplicated on different Hadoop clusters. 葉佐任 2016 學位論文 ; thesis 46 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === Fu Jen Catholic University (輔仁大學) === Graduate Program, Department of Computer Science and Information Engineering (資訊工程學系碩士班) === 104 === In recent years, cloud computing has developed rapidly in the realm of the Internet. Among the many cloud computing platforms, Hadoop is widely used because of its stability and performance. It can easily handle a large number of files efficiently. Hadoop is a distributed system, and the Hadoop Distributed File System (HDFS) is its default file system. HDFS consists of a NameNode and multiple DataNodes. The NameNode records file metadata, including file location, file owner, and other related information. The DataNodes are where the file contents are actually stored. In general, each file is replicated on several DataNodes. However, file contents can still become unretrievable if the NameNode is lost, or if all DataNodes storing those files are destroyed at the same time. To address this problem, important files can be backed up on multiple Hadoop clusters. Nevertheless, errors can occur during file duplication. We design and implement a scheme that identifies discrepancies between Hadoop clusters so that users can fix mismatches between files duplicated on different Hadoop clusters.
author2 葉佐任
author_facet 葉佐任
YU TZU-TING
游資婷
author YU TZU-TING
游資婷
spellingShingle YU TZU-TING
游資婷
Identifying the Data Discrepancy Existing in Hadoop Clusters
author_sort YU TZU-TING
title Identifying the Data Discrepancy Existing in Hadoop Clusters
title_short Identifying the Data Discrepancy Existing in Hadoop Clusters
title_full Identifying the Data Discrepancy Existing in Hadoop Clusters
title_fullStr Identifying the Data Discrepancy Existing in Hadoop Clusters
title_full_unstemmed Identifying the Data Discrepancy Existing in Hadoop Clusters
title_sort identifying the data discrepancy existing in hadoop clusters
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/21290145999940618636
work_keys_str_mv AT yutzuting identifyingthedatadiscrepancyexistinginhadoopclusters
AT yóuzītíng identifyingthedatadiscrepancyexistinginhadoopclusters
AT yutzuting shíxiànyúnduānyùnsuànhadoopcóngjíchǔcúnzīliàozhīchàyìfēnxī
AT yóuzītíng shíxiànyúnduānyùnsuànhadoopcóngjíchǔcúnzīliàozhīchàyìfēnxī
_version_ 1718445389693059072