Correlation Aware Technique for SQL to NoSQL Transformation
Master's === Chung Hua University === Master's Program, Department of Computer Science and Information Engineering === 102 === To improve the efficiency of parallel and distributed computing, Apache Hadoop distributes imported data randomly across data nodes. This mechanism works well for general data analysis, but when the data sets are related to one another (example...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/77732462514698260699 |
id |
ndltd-TW-102CHPI5392021 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102CHPI53920212017-02-17T16:16:41Z http://ndltd.ncl.edu.tw/handle/77732462514698260699 Correlation Aware Technique for SQL to NoSQL Transformation 關聯感知技術應用於SQL 與 NoSQL 資料轉換 Hsu, Jen-Chun 徐仁淳 Master's Chung Hua University Master's Program, Department of Computer Science and Information Engineering 102 To improve the efficiency of parallel and distributed computing, Apache Hadoop distributes imported data randomly across data nodes. This mechanism works well for general data analysis, but when the data sets are related to one another (for example, tables in a database), storing related data together becomes an important issue. As data sets grow and more data must be stored in a database, a traditional database can no longer provide efficient service to real-time systems, so many users turn to Hadoop to improve database performance. Apache provides a tool named Sqoop that can import entire databases into a Hadoop environment through a command-line interface. Following the same concept as Hadoop, Apache Sqoop separates each table into four parts and distributes them randomly across data nodes. However, this data placement mechanism still raises a database performance concern. This paper proposes a Correlation Aware method on Sqoop (CA_Sqoop) to improve data placement. CA_Sqoop gathers related data as close together as possible to reduce the cost of transferring data over the network and to improve database performance. CA_Sqoop also considers table correlation and size for better data locality and query efficiency. Simulation results show that the data locality of CA_Sqoop is two times better than that of the original Apache Sqoop. Hsu, Ching-Hsien 許慶賢 2014 degree thesis ; thesis 28 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Chung Hua University === Master's Program, Department of Computer Science and Information Engineering === 102 === To improve the efficiency of parallel and distributed computing, Apache Hadoop
distributes imported data randomly across data nodes. This mechanism works well for
general data analysis, but when the data sets are related to one another (for example,
tables in a database), storing related data together becomes an important issue.
As data sets grow and more data must be stored in a database, a traditional database
can no longer provide efficient service to real-time systems, so many users turn to
Hadoop to improve database performance. Apache provides a tool named Sqoop that can
import entire databases into a Hadoop environment through a command-line interface.
Following the same concept as Hadoop, Apache Sqoop separates each table into four
parts and distributes them randomly across data nodes. However, this data placement
mechanism still raises a database performance concern.
This paper proposes a Correlation Aware method on Sqoop (CA_Sqoop) to
improve data placement. CA_Sqoop gathers related data as close together as possible
to reduce the cost of transferring data over the network and to improve database
performance. CA_Sqoop also considers table correlation and size for better
data locality and query efficiency. Simulation results show that the data locality of
CA_Sqoop is two times better than that of the original Apache Sqoop.
|
author2 |
Hsu, Ching-Hsien |
author_facet |
Hsu, Ching-Hsien Hsu, Jen-Chun 徐仁淳 |
author |
Hsu, Jen-Chun 徐仁淳 |
spellingShingle |
Hsu, Jen-Chun 徐仁淳 Correlation Aware Technique for SQL to NoSQL Transformation |
author_sort |
Hsu, Jen-Chun |
title |
Correlation Aware Technique for SQL to NoSQL Transformation |
title_short |
Correlation Aware Technique for SQL to NoSQL Transformation |
title_full |
Correlation Aware Technique for SQL to NoSQL Transformation |
title_fullStr |
Correlation Aware Technique for SQL to NoSQL Transformation |
title_full_unstemmed |
Correlation Aware Technique for SQL to NoSQL Transformation |
title_sort |
correlation aware technique for sql to nosql transformation |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/77732462514698260699 |
work_keys_str_mv |
AT hsujenchun correlationawaretechniqueforsqltonosqltransformation AT xúrénchún correlationawaretechniqueforsqltonosqltransformation AT hsujenchun guānliángǎnzhījìshùyīngyòngyúsqlyǔnosqlzīliàozhuǎnhuàn AT xúrénchún guānliángǎnzhījìshùyīngyòngyúsqlyǔnosqlzīliàozhuǎnhuàn |
_version_ |
1718414970369081344 |
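
The placement idea summarized in the description field can be illustrated with a small sketch. This is not the thesis's CA_Sqoop algorithm, which this record does not specify; it is a hypothetical greedy heuristic written only to show how table correlation and size might steer placement, compared against a random four-part split like the one the abstract attributes to Sqoop. All names here (`random_placement`, `correlation_aware_placement`, `locality`, the table names, sizes, weights, and the per-node `capacity`) are assumptions made for illustration.

```python
import random


def random_placement(tables, num_nodes, parts=4, seed=0):
    """Baseline: split each table into `parts` pieces and scatter them randomly."""
    rng = random.Random(seed)
    return {t: [rng.randrange(num_nodes) for _ in range(parts)] for t in tables}


def correlation_aware_placement(sizes, correlations, num_nodes, capacity, parts=4):
    """Hypothetical greedy heuristic: keep strongly correlated tables together.

    sizes:        {table: size}
    correlations: {(table_a, table_b): weight}
    Returns {table: [node for each part]}, with all parts of a table on one node.
    """
    free = [capacity] * num_nodes
    node_of = {}

    def put(table, preferred=None):
        if table in node_of:
            return
        # Try the partner's node first, then nodes with the most free space.
        order = sorted(range(num_nodes), key=lambda n: -free[n])
        if preferred is not None:
            order = [preferred] + [n for n in order if n != preferred]
        for n in order:
            if free[n] >= sizes[table]:
                node_of[table] = n
                free[n] -= sizes[table]
                return
        raise ValueError(f"no node can hold table {table!r}")

    # Visit table pairs from the strongest correlation to the weakest.
    for (a, b), _weight in sorted(correlations.items(), key=lambda kv: -kv[1]):
        put(a, node_of.get(b))
        put(b, node_of.get(a))
    # Place tables that appear in no correlation pair.
    for table in sizes:
        put(table)
    return {t: [n] * parts for t, n in node_of.items()}


def locality(placement, correlations):
    """Weighted share of part pairs of correlated tables that sit on the same node."""
    total = sum(correlations.values())
    score = 0.0
    for (a, b), weight in correlations.items():
        pairs = [(x, y) for x in placement[a] for y in placement[b]]
        score += weight * sum(x == y for x, y in pairs) / len(pairs)
    return score / total


# Made-up example data: four tables, two correlated pairs, two data nodes.
sizes = {"orders": 40, "customers": 20, "items": 30, "logs": 50}
correlations = {("orders", "customers"): 5, ("orders", "items"): 3}

aware = correlation_aware_placement(sizes, correlations, num_nodes=2, capacity=100)
baseline = random_placement(sizes, num_nodes=2)
print("correlation-aware locality:", locality(aware, correlations))
print("random baseline locality:  ", locality(baseline, correlations))
```

The sketch places whole tables rather than balancing individual parts, which trades load balance for locality; a real system would have to weigh both, and the numbers it prints say nothing about the simulation results reported in the thesis.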