The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
Master's === Tunghai University === Department of Computer Science === 105 === Air quality has become a major concern in Taiwan, where air pollution has been a recurring problem in recent years. The government therefore needs various systems to serve as benchmarks in addressing air pollution. In addition, in order to understand whether air qua...
Main Authors: | AMRAN, 陳彩進 |
---|---|
Other Authors: | YANG, CHAO-TUNG |
Format: | Others |
Language: | en_US |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/5y72n8 |
id |
ndltd-TW-105THU00394008 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105THU00394008 2019-05-15T23:24:51Z http://ndltd.ncl.edu.tw/handle/5y72n8 The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform 即時空氣品質及類流感資料儲存與處理平台之研製 AMRAN 陳彩進 Master's Tunghai University Department of Computer Science 105 YANG, CHAO-TUNG 楊朝棟 2017 thesis 90 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === Tunghai University === Department of Computer Science === 105 === Air quality has become a major concern in Taiwan, where air pollution has been a recurring problem in recent years. The government therefore needs various systems to serve as benchmarks in addressing air pollution. In addition, to understand whether air quality is associated with influenza-like illness (ILI) statistics, we need to build an integrated system that combines air pollution and ILI data in a single platform. The purpose of this study is to provide an innovative research environment that focuses on performance and value-added applications. The work consists of three phases of design and implementation. First, we build a cluster with HDFS and Spark as the processing environment, the ELK Stack as the visualization platform, and Ceph Object Storage as the cluster backup storage. Second, we use an Open Data API to transfer air quality and ILI data into MySQL. The study also addresses several problems. First, the relational database in this ecosystem serves both the front end and the back end and is not well suited to big data, so reads and writes are slow; we therefore add table indexes to speed up operations. Second, data must be transferred between MySQL and HDFS. Storing big data in a single file is not a good solution because it slows down operations, and using Sqoop to split the data into multiple files takes a long time, so we use the “with direction” function to split the data into multiple files with the same duration. Finally, in our study, processing slows down as the number of data operations grows, so we use Alluxio as an in-memory intermediate storage layer.
|
author2 |
YANG, CHAO-TUNG |
author_facet |
YANG, CHAO-TUNG AMRAN 陳彩進 |
author |
AMRAN 陳彩進 |
author_sort |
AMRAN |
title |
The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform |
title_sort |
implementation of a real-time air quality and influenza-like illness data storage and processing platform |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/5y72n8 |
_version_ |
1719147567730655232 |