The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform

Master's === Tunghai University === Department of Computer Science === 105 === Air quality has become a major concern in Taiwan, where air pollution has been a recurring problem in recent years. The government therefore needs various systems to serve as benchmarks in addressing air pollution. In addition, in order to understand whether air qua...

Bibliographic Details
Main Authors: AMRAN, 陳彩進
Other Authors: YANG, CHAO-TUNG
Format: Others
Language: en_US
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/5y72n8
id ndltd-TW-105THU00394008
record_format oai_dc
spelling ndltd-TW-105THU00394008 2019-05-15T23:24:51Z http://ndltd.ncl.edu.tw/handle/5y72n8 The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform 即時空氣品質及類流感資料儲存與處理平台之研製 AMRAN 陳彩進 碩士 東海大學 資訊工程學系 105 YANG, CHAO-TUNG 楊朝棟 2017 學位論文 ; thesis 90 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === Tunghai University === Department of Computer Science === 105 === Air quality has become a major concern in Taiwan, where air pollution has been a recurring problem in recent years. The government therefore needs various systems to serve as benchmarks in addressing air pollution. In addition, to understand whether air quality is associated with Influenza-Like Illness (ILI) statistics, we need to build an integrated system that statistically combines air pollution and ILI into a unified whole. The purpose of this study is to provide an innovative application of the research environment that focuses on performance and value-added applications. In more detail, the design and implementation consist of three phases. First, we build a cluster with HDFS and Spark as the operating environment, the ELK Stack as the visualization platform, and Ceph Object Storage as the cluster's backup storage. Second, we use an Open Data API to transfer air quality and ILI data into MySQL. The study also faces several problems. First, the relational database in this ecosystem is used for both the front end and the back end, and it is not well suited to big data: reading and writing data are slow, so we need table indexes to speed up operations. Second, data must be transferred between MySQL and HDFS. Storing big data in a single file is not a good solution, because it can slow down operations, while using Sqoop to split the data into multiple files takes a long time. We therefore need the “with direction” function to split the data into multiple files in the same amount of time. Finally, in our study, the more data operations there are, the slower processing becomes, so we need Alluxio as an in-memory intermediate bridge storage layer.
author2 YANG, CHAO-TUNG
author_facet YANG, CHAO-TUNG
AMRAN
陳彩進
author AMRAN
陳彩進
spellingShingle AMRAN
陳彩進
The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
author_sort AMRAN
title The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
title_short The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
title_full The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
title_fullStr The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
title_full_unstemmed The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform
title_sort implementation of a real-time air quality and influenza-like illness data storage and processing platform
publishDate 2017
url http://ndltd.ncl.edu.tw/handle/5y72n8
work_keys_str_mv AT amran theimplementationofarealtimeairqualityandinfluenzalikeillnessdatastorageandprocessingplatform
AT chéncǎijìn theimplementationofarealtimeairqualityandinfluenzalikeillnessdatastorageandprocessingplatform
AT amran jíshíkōngqìpǐnzhìjílèiliúgǎnzīliàochǔcúnyǔchùlǐpíngtáizhīyánzhì
AT chéncǎijìn jíshíkōngqìpǐnzhìjílèiliúgǎnzīliàochǔcúnyǔchùlǐpíngtáizhīyánzhì
AT amran implementationofarealtimeairqualityandinfluenzalikeillnessdatastorageandprocessingplatform
AT chéncǎijìn implementationofarealtimeairqualityandinfluenzalikeillnessdatastorageandprocessingplatform
_version_ 1719147567730655232