Analyzing Open Data by R Language and Hadoop - Using Data of Weather and Agricultural Transactions

Bibliographic Details
Main Authors: Hao-Ren Wu, 吳豪仁
Other Authors: Nian-Ze Hu
Format: Others
Language: zh-TW
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/7dd75x
Description
Summary: Master's thesis === National Formosa University (國立虎尾科技大學) === Graduate Institute of Information Management === 103 === In recent years, open data has become a popular topic in information technology because of its large potential value. Western countries and international organizations such as the United Nations have worked to promote open government data. However, data obtained from different sources usually does not share a common format, which makes it inconvenient to integrate and analyze. An important problem, therefore, is to develop a mechanism that can collect and integrate heterogeneous open datasets seamlessly and help analysts retrieve potentially useful information efficiently. This study uses the Hadoop platform and the R language to implement a prototype system that automatically captures and consolidates open data. When processing finishes, all results, including summarized data, analytical models, decision tree rules, and discovered key factors, are stored in a relational database and HDFS. We apply these procedures to agricultural transaction data and historical climate records. In addition, the system derives the key factors common to the various crops belonging to a specified category by applying the proposed looping decision tree mechanism.
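
The record does not describe how the looping decision tree mechanism is implemented, so the following is only a minimal sketch of the general idea, under stated assumptions. It is written in plain Python, not the R and Hadoop stack the thesis uses. It joins transaction records to weather records by date, fits a depth-1 regression tree (a single split) per crop, and counts which feature is chosen as the top split across crops. All data values, field names (`rainfall_mm`, `temp_c`, `price`), and function names are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: (date, crop, price) and date -> weather features.
transactions = [
    ("d1", "cabbage", 10.0), ("d2", "cabbage", 11.0),
    ("d3", "cabbage", 20.0), ("d4", "cabbage", 21.0),
    ("d1", "spinach", 30.0), ("d2", "spinach", 29.0),
    ("d3", "spinach", 45.0), ("d4", "spinach", 46.0),
]
weather = {
    "d1": {"rainfall_mm": 0.0, "temp_c": 15.0},
    "d2": {"rainfall_mm": 0.0, "temp_c": 25.0},
    "d3": {"rainfall_mm": 50.0, "temp_c": 15.0},
    "d4": {"rainfall_mm": 50.0, "temp_c": 25.0},
}


def join_by_date(transactions, weather):
    """Merge each transaction with the weather record for the same date."""
    rows = []
    for date, crop, price in transactions:
        if date in weather:  # drop transactions with no matching weather
            rows.append({"crop": crop, "price": price, **weather[date]})
    return rows


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split_feature(rows, features, target="price"):
    """Return the feature whose single threshold split most reduces
    squared error, i.e. the root of a depth-1 regression tree."""
    total = sse([r[target] for r in rows])
    best_feature, best_gain = None, 0.0
    for feature in features:
        for threshold in sorted({r[feature] for r in rows}):
            left = [r[target] for r in rows if r[feature] <= threshold]
            right = [r[target] for r in rows if r[feature] > threshold]
            if not left or not right:
                continue  # not a real split
            gain = total - sse(left) - sse(right)
            if gain > best_gain:
                best_feature, best_gain = feature, gain
    return best_feature


def common_key_factors(rows, features):
    """Loop over crops, fit one tree per crop, and count how often
    each feature is selected as the top split."""
    counts = Counter()
    for crop in sorted({r["crop"] for r in rows}):
        crop_rows = [r for r in rows if r["crop"] == crop]
        top = best_split_feature(crop_rows, features)
        if top is not None:
            counts[top] += 1
    return counts


rows = join_by_date(transactions, weather)
factors = common_key_factors(rows, ["rainfall_mm", "temp_c"])
print(factors.most_common())
```

A feature selected for most or all crops in a category would be a candidate common key factor in this simplified reading. A fuller version would grow deeper trees and rank several features per crop.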