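Summary: | Abstract (translated from the Chinese)
Past studies of insurance data have mostly relied on traditional statistical methods, so valuable information contained in insurers' large databases may have been overlooked.
This study applies data mining techniques to policy loan data for policyholders in Kaohsiung City and County, drawn from an insurance company's database. It studies how policyholders use policy loans, as a reference for the company's future promotion of policy loans.
From the cleaned data, we draw samples using different sampling methods, different sample sizes, and different ratios of policies with loans to policies without loans. After transforming the continuous variables, we build decision tree and neural network models and use analysis of variance (ANOVA) to examine how the four factors affect the quality of the predictions. We then select the best combination of sample size, loan ratio (with loan : without loan), sampling method, and model.
Finally, the C4.5 decision tree built with this best combination is converted into rules, and the rules with the highest accuracy are examined as a reference for the insurance company.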
=== Abstract
In the past, insurance data have usually been analyzed with traditional statistical methods; however, a large amount of valuable information hidden in the data may be left undiscovered.
The purpose of this research is to apply data mining techniques to customer policy data from an insurance company's database covering Kaohsiung City and County. We study how customers take out loans against their policies, as a reference for the insurance company in promoting policy loans in the future.
From the cleansed data, we draw samples of different sizes and different percentages of policies with loans using different sampling methods, and build decision tree and neural network models. We then use ANOVA, including its significant interactions, to discuss how the results are influenced by the four factors: sample size, loan percentage, sampling method, and model. Finally, we choose the best model, which reveals the factors affecting customers' behavior in taking out loans, thus giving the insurance company vital information for targeting its customer groups.
|
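The four-factor design described in the abstract can be illustrated with a minimal Python sketch. It only shows how a full factorial grid of sample size, loan share, sampling method, and model might be enumerated, and how one sample with a target loan share might be drawn. The record schema (`has_loan`), the function names, the factor levels, and the sampling-method labels are all hypothetical; the abstract does not state the actual levels or methods used in the thesis.

```python
import itertools
import random


def sample_with_ratio(records, size, loan_share, rng):
    """Draw `size` records so that a `loan_share` fraction of them have a loan.

    `records` is a list of dicts with a boolean "has_loan" key
    (a hypothetical schema, not the thesis's actual data layout).
    """
    with_loan = [r for r in records if r["has_loan"]]
    without_loan = [r for r in records if not r["has_loan"]]
    n_loan = round(size * loan_share)
    n_no_loan = size - n_loan
    if n_loan > len(with_loan) or n_no_loan > len(without_loan):
        raise ValueError("not enough records in one class for this design cell")
    # Simple random sampling without replacement inside each class.
    return rng.sample(with_loan, n_loan) + rng.sample(without_loan, n_no_loan)


def design_grid(sizes, loan_shares, sampling_methods, models):
    """Enumerate every combination of the four factors (full factorial design)."""
    return list(itertools.product(sizes, loan_shares, sampling_methods, models))


# Illustrative factor levels only; these are assumptions, not the thesis's values.
SIZES = [200, 400]
LOAN_SHARES = [0.5, 0.3]
SAMPLING_METHODS = ["simple_random", "systematic"]
MODELS = ["decision_tree", "neural_network"]

rng = random.Random(0)
# Synthetic policies: every fourth one has a loan.
policies = [{"id": i, "has_loan": i % 4 == 0} for i in range(2000)]

grid = design_grid(SIZES, LOAN_SHARES, SAMPLING_METHODS, MODELS)
sample = sample_with_ratio(policies, size=200, loan_share=0.3, rng=rng)
```

Each cell of `grid` would correspond to one model fit, and the resulting prediction accuracies would be the response variable in the ANOVA.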