Study of Machine Learning Algorithms on Spark

Master's === National Changhua University of Education === Department of Information Management === 104 === In 2015, Gartner selected machine learning as one of the top ten strategic technologies of the future at its Technology of the Year Seminar (Gartner Symposium/ITxpo) held in Orlando. In addition, IoT (Internet of Things) and machine learning are...

Bibliographic Details
Main Authors: KO,YU-HUNG, 柯仲璟
Other Authors: 吳東光
Format: Others
Language: zh-TW
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/76239070110682716614
id ndltd-TW-104NCUE5396017
record_format oai_dc
spelling ndltd-TW-104NCUE5396017 2017-09-03T04:25:30Z http://ndltd.ncl.edu.tw/handle/76239070110682716614 Study of Machine Learning Algorithms on Spark 植基於Spark之機器學習演算法研究 KO,YU-HUNG 柯仲璟 Master's National Changhua University of Education Department of Information Management 104 吳東光 2016 Degree thesis ; thesis 50 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Changhua University of Education === Department of Information Management === 104 === In 2015, Gartner selected machine learning as one of the top ten strategic technologies of the future at its Technology of the Year Seminar (Gartner Symposium/ITxpo) held in Orlando. In addition, the ICT (Information and Communication Technology) industries regard IoT (Internet of Things) and machine learning as essential next-generation technologies, both of which are based on big data analysis techniques. It is believed that whoever holds the edge in big data may lead the industry. Accordingly, the potential of big data has attracted major international companies into this market. In this thesis, we investigate Apache Spark, which provides fast processing of various machine learning algorithms through its in-memory technology and cluster computing framework. In particular, the performance of two algorithms, decision tree and support vector machine, is evaluated in multithreaded and multi-host environments. Big data regenerated from a data set collected for the diagnosis of students with learning disabilities is used as the test sample for the evaluation. Keywords: Machine learning, Big Data, Apache Spark, decision tree, Support Vector Machine.
author2 吳東光
author_facet 吳東光
KO,YU-HUNG
柯仲璟
author KO,YU-HUNG
柯仲璟
spellingShingle KO,YU-HUNG
柯仲璟
Study of Machine Learning Algorithms on Spark
author_sort KO,YU-HUNG
title Study of Machine Learning Algorithms on Spark
title_sort study of machine learning algorithms on spark
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/76239070110682716614
work_keys_str_mv AT koyuhung studyofmachinelearningalgorithmsonspark
AT kēzhòngjǐng studyofmachinelearningalgorithmsonspark
AT koyuhung zhíjīyúsparkzhījīqìxuéxíyǎnsuànfǎyánjiū
AT kēzhòngjǐng zhíjīyúsparkzhījīqìxuéxíyǎnsuànfǎyánjiū
_version_ 1718525509564891136