Using Open Data and Decision Tree Algorithm to Explore Factors Associated with Life Expectancy around the World

Master's === 國立臺北護理健康大學 === 資訊管理研究所 === 107 === In recent years, as economies have developed, countries around the world have sought to improve the health and life expectancy of their people. How to extend life expectancy is a concern for every country...

Bibliographic Details
Main Authors: YAN, SHI-XIANG, 顏士祥
Other Authors: JIANG, WEY-WEN
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/a6k8ta
id ndltd-TW-107NTCN0396005
record_format oai_dc
spelling ndltd-TW-107NTCN0396005 2019-10-24T05:20:15Z http://ndltd.ncl.edu.tw/handle/a6k8ta Using Open Data and Decision Tree Algorithm to Explore Factors Associated with Life Expectancy around the World 運用開放資料以決策樹演算法探討全球各國預期壽命之相關因素 YAN, SHI-XIANG 顏士祥 Master's 國立臺北護理健康大學 資訊管理研究所 107 JIANG, WEY-WEN 江蔚文 2019 學位論文 ; thesis 133 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === 國立臺北護理健康大學 === 資訊管理研究所 === 107 === In recent years, as economies have developed, countries around the world have sought to improve the health and life expectancy of their people, and how to extend life expectancy is a concern for every country. This study therefore seeks to identify the important factors affecting life expectancy from several perspectives, so as to suggest ways to improve it. To give a comprehensive view, the study considers economic development, network infrastructure, education level, national nutrition, physical activity, and other factors, and examines whether they affect national life expectancy. Drawing on five of the United Nations' 2015 Sustainable Development Goals ("Good Health and Well-Being", "Quality Education", "Decent Work and Economic Growth", "Industry, Innovation and Infrastructure", and "Sustainable Cities and Communities"), the study gathered numerical open data from international platforms, covering 122 countries and 21 independent variables. First, correlation coefficients were calculated between each of the 21 independent variables and life expectancy. Then, multiple linear regression was used to model life expectancy from the 21 independent variables. Finally, a decision tree algorithm was used to induce splitting rules over the 21 independent variables that predict life expectancy.
According to the correlation coefficients, elevated total cholesterol (≥ 190 mg/dL), expected years of schooling, and internet users were strongly and positively correlated with life expectancy. The multiple linear regression model achieved an R-squared of 0.86, indicating good explanatory power. Decision tree analysis showed that total cholesterol (≥ 190 mg/dL), cardiovascular diseases, blood pressure, GDP per capita, fruit and vegetable supply in grams, and cancer prevalence rate were all important splitting factors. The study found that, in addition to higher GDP, higher expected years of schooling, higher internet usage, and higher fruit and vegetable supply were also associated with higher life expectancy. It therefore recommends that countries keep working to improve these four indicators.
author2 JIANG, WEY-WEN
author_facet JIANG, WEY-WEN
YAN, SHI-XIANG
顏士祥
author YAN, SHI-XIANG
顏士祥
author_sort YAN, SHI-XIANG
title Using Open Data and Decision Tree Algorithm to Explore Factors Associated with Life Expectancy around the World
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/a6k8ta
_version_ 1719276961927266304
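
The abstract in this record describes a three-step analysis: correlation coefficients, a linear regression model, and a decision tree. The following is a minimal Python sketch of those three steps on a single predictor, under stated assumptions. The data values are invented for illustration and are not the thesis data; the thesis used 21 variables and multiple regression, whereas this sketch fits one predictor; and the record does not say which tree algorithm was used, so the variance-reduction split shown here (in the style of CART regression trees) is an assumption. The function names `pearson`, `r_squared`, `sse`, and `best_split` are hypothetical helpers.

```python
# Illustrative only: the numbers below are invented, not the thesis data.
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def r_squared(xs, ys):
    """R-squared of a one-predictor least-squares fit y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold on x minimizing the summed SSE of y on both sides.

    This is the single-split step that a regression tree repeats recursively
    (assumed here; the record does not name the tree algorithm).
    """
    points = sorted(set(xs))
    best = None
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost, mean(left), mean(right))
    return best


# Hypothetical toy data: expected years of schooling vs. life expectancy.
schooling = [8.0, 9.5, 11.0, 12.5, 14.0, 15.5, 17.0, 18.5]
life_exp = [58.0, 61.0, 63.0, 70.0, 75.0, 77.0, 80.0, 82.0]

r = pearson(schooling, life_exp)
r2 = r_squared(schooling, life_exp)
threshold, cost, left_mean, right_mean = best_split(schooling, life_exp)

print(f"correlation r = {r:.3f}")
print(f"R-squared     = {r2:.3f}")
print(
    f"split: schooling <= {threshold:.2f} -> predict {left_mean:.1f}, "
    f"else {right_mean:.1f}"
)
```

With a single predictor, the R-squared of the fit equals the square of the correlation coefficient, which makes the first two steps easy to cross-check. A full tree would apply `best_split` to every candidate variable, keep the split with the lowest cost, and recurse into each side.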