Using Unsupervised Learning Method to Supplement Supervised Learning Method

Bibliographic Details
Main Authors: Yen-Chih Hsu, 許硯智
Other Authors: Chong-Yau Fu
Format: Others
Language: zh-TW
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/60102868894569961661
Description
Summary: Master's === National Yang-Ming University === Institute of Public Health === 91 === Unsupervised learning methods concern only the characteristics of the variables (Xs); no dependent variable is specified. The goal of such methods is to infer the properties of the Xs. The aim of this thesis is to use the results of an unsupervised learning method to help construct a logistic regression model (a supervised learning method). One unsupervised learning method is cluster analysis, which uncovers the structure contained in the original dataset. It groups observations that are close to each other, so that observations in the same cluster share similar characteristics. Cluster analysis falls into two categories: hierarchical and non-hierarchical methods. The logistic regression model represents supervised learning methods in this thesis. The data were collected in the Department of Obstetrics and Gynecology at Taipei Veterans General Hospital from January to December 2002. The purpose of the study is to investigate the degree of pulse waveform damping, also called the pulsatility index, in different vessels, to see which combinations are the most sensitive for evaluating pregnancy outcome. In this thesis, we use the average linkage algorithm (hierarchical) and the k-means algorithm (non-hierarchical) to group the pulsatility indices of the vessels. The results of the cluster analysis provide statistical evidence for grouping similar variables. The grouping results are then applied in constructing the logistic model. The final model reveals that the umbilical artery, ductus venosus, and pulmonary vein series are good predictors of pregnancy outcome.
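
The variable-grouping step described in the summary can be illustrated with a minimal sketch of average-linkage hierarchical clustering applied to variables rather than observations. This is not the thesis's code: the dissimilarity `1 - |r|` (Pearson correlation) is an assumption, since the summary does not state which measure was used, and the variable names (`pi_a1`, `pi_a2`, `pi_v1`, `pi_v2`) and the simulated data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage(columns, n_clusters):
    """Agglomerative clustering of variables with average linkage.

    columns: dict mapping variable name -> list of values.
    Dissimilarity between two variables is 1 - |r| (an assumption).
    """
    names = list(columns)
    dist = {
        (a, b): 1 - abs(pearson(columns[a], columns[b]))
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean pairwise distance between members.
                total = sum(dist[(a, b)]
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical data: two latent factors, each driving two noisy indices.
rng = random.Random(0)
f1 = [rng.gauss(0, 1) for _ in range(200)]
f2 = [rng.gauss(0, 1) for _ in range(200)]
data = {
    "pi_a1": [v + 0.3 * rng.gauss(0, 1) for v in f1],
    "pi_a2": [v + 0.3 * rng.gauss(0, 1) for v in f1],
    "pi_v1": [v + 0.3 * rng.gauss(0, 1) for v in f2],
    "pi_v2": [v + 0.3 * rng.gauss(0, 1) for v in f2],
}

groups = average_linkage(data, n_clusters=2)
print(groups)
```

In a workflow like the one the summary describes, each resulting group of similar indices could then be represented by one variable, or a combined score, when building the logistic regression model. How the thesis actually carried the groups into the model is not stated in the summary.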