The Implementation of a Real-Time Air Quality and Influenza-Like Illness Data Storage and Processing Platform

Bibliographic Details
Main Authors: AMRAN, 陳彩進
Other Authors: YANG, CHAO-TUNG
Format: Others
Language: en_US
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/5y72n8
Description
Summary: Master's === Tunghai University === Department of Computer Science === 105 === Air quality has become a major concern in Taiwan, where air pollution has been a recurring problem in recent years. The government therefore needs various systems to serve as benchmarks in addressing air pollution. In addition, to understand whether air quality is associated with Influenza-Like Illness (ILI) statistics, we need an integrated system that combines air pollution data and ILI statistics in a single platform. The purpose of this study is to provide an innovative research environment that focuses on performance and value-added applications. In more detail, the work consists of three phases of design and implementation. First, we build a cluster with HDFS and Spark as the processing environment, the ELK Stack as the visualization platform, and Ceph Object Storage as cluster backup storage. Second, we use an Open Data API to transfer air quality and ILI data into MySQL. The study also faces several problems. First, the relational database that this ecosystem uses for the front end and back end is not well suited to big data: reading and writing data are slow, so we need table indexes to speed up operations. Second, data must be transferred between MySQL and HDFS. Storing big data in a single file is not a good solution because it slows down operations, while using Sqoop to split the data into multiple files takes a long time, so we need the “with direction” function to split the data into multiple files with the same duration. Finally, in our study, the more data operations there are, the slower the processing becomes, so we need Alluxio as an in-memory intermediate storage layer.
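The idea of writing a large table as several files instead of one can be illustrated with a minimal sketch. It assumes the table has an integer key and divides that key range into contiguous, nearly equal parts, one per output file. The function names `split_ranges` and `where_clauses` and the column name `id` are hypothetical; this is not the thesis's implementation and not Sqoop's code, only an illustration of range-based splitting.

```python
def split_ranges(lo, hi, parts):
    """Divide the inclusive integer key range [lo, hi] into at most
    `parts` contiguous ranges whose sizes differ by no more than one."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    total = hi - lo + 1
    parts = min(parts, total)          # never create empty ranges
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` ranges take one additional key each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def where_clauses(column, ranges):
    """Build one SQL predicate per range (hypothetical helper);
    each predicate would select the rows for one output file."""
    return [f"{column} >= {a} AND {column} <= {b}" for a, b in ranges]


if __name__ == "__main__":
    # Assumed example: keys 1..10 written as 4 files.
    for clause in where_clauses("id", split_ranges(1, 10, 4)):
        print(clause)
```

Each predicate could drive one independent export task, so the files can be written in parallel rather than one after another.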