Using Las Vegas Filter to Select Features and Back-Propagation Neural Network to Adjust Outliers: Diabetes Database Example

Bibliographic Details
Main Authors: Yi-Ting Jiang, 蔣依婷
Other Authors: Tsung-Yuan Tseng
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/39128481276759847240
Description
Summary: Master's thesis === Huafan University === Master's Program, Department of Information Management === 98 === Data mining has recently been widely applied to medical diagnosis. However, most medical databases are diverse and heterogeneous, and their minority class contains a large number of outliers, which reduces the accuracy of subsequent data mining. In an imbalanced database, deleting every minority-class record that contains outliers would leave too few samples and further degrade the accuracy of subsequent classification. The only remaining option is to adjust the outliers and return the records to the data set for mining. Taking a diabetes database as an example of an imbalanced database whose minority class contains outliers, this study uses LVF (Las Vegas Filter) to select the features related to the outliers and a BPN (Back-Propagation Neural Network) to adjust the outliers, in order to improve the accuracy of subsequent classification. Compared with the traditional t test and χ² (chi-square) test, the results show that LVF can select the attributes relevant to the data items containing outliers and thereby improve the accuracy of subsequent classification.
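The feature-selection step named in the summary can be illustrated with a minimal sketch of the Las Vegas Filter idea: draw random feature subsets and keep the smallest one whose inconsistency rate stays within a threshold. This is not the thesis's implementation. The function names, the `max_tries` and `threshold` parameters, and the toy data are assumptions made for illustration, features are assumed to be discrete, and the BPN outlier-adjustment step is not shown.

```python
import random
from collections import Counter, defaultdict
from itertools import product


def inconsistency_rate(rows, labels, subset):
    """Share of records whose class differs from the majority class
    among records that agree on every feature in `subset`."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(rows)


def lvf(rows, labels, max_tries=1000, threshold=0.0, seed=0):
    """Las Vegas Filter sketch: random search for a small consistent subset.

    `max_tries`, `threshold`, and `seed` are illustrative parameters.
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    best = list(range(n_features))  # start from the full feature set
    for _ in range(max_tries):
        size = rng.randint(1, n_features)
        candidate = sorted(rng.sample(range(n_features), size))
        # Accept only strictly smaller subsets that stay consistent enough.
        if (len(candidate) < len(best)
                and inconsistency_rate(rows, labels, candidate) <= threshold):
            best = candidate
    return best


# Hypothetical toy data (not the diabetes database): three binary
# features, with the class determined by feature 0 alone.
rows = list(product([0, 1], repeat=3))
labels = [row[0] for row in rows]
selected = lvf(rows, labels, max_tries=200)
print(selected)
```

In this toy data, feature 0 is the only single feature with an inconsistency rate of zero, so it is the subset the random search is expected to settle on. On real records, continuous attributes would need to be discretized first, and a nonzero threshold would tolerate some noise.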