The Impact of Sample Size and Feature Selection in Classification of Alzheimer's Disease


Bibliographic Details
Main Authors: Ai-Ling Hsu, 許艾伶
Other Authors: Ching-Po Lin
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/16168003497208394739
Description
Summary: Master's thesis === National Yang-Ming University === Institute of Brain Science === 99 === There has been increasing interest in applying classification methods to discriminate neurodegenerative diseases using anatomical MRI. When classification is applied to neuroimaging data, the high dimensionality can make classifier training unstable. To reduce the dimensionality problem, feature selection is often applied. However, the benefit of feature selection in classification remains controversial, possibly because studies differ in sample size and feature selection method. In this study, we hypothesize that the benefit of feature selection depends on the training set size and on the feature selection method. We tested four common feature selection methods: (1) pre-selected regions of interest (ROIs) based on prior knowledge, (2) univariate t-test filtering, (3) recursive feature elimination (RFE), and (4) t-test filtering constrained by ROIs. We also tested whether the advantage of feature selection changes with training size by comparing classification accuracies with and without feature selection at different sample sizes. We used T1 anatomical scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) as input features to classify Alzheimer's disease patients versus normal controls, and mild cognitive impairment patients versus normal controls. The two data-driven methods (t-test filtering and RFE) yielded classification accuracies no better than using the whole brain without feature selection. Using ROIs of the hippocampus and parahippocampal gyrus resulted in the best classification accuracies. In general, larger sample sizes yielded higher accuracies. When the sample size was large, feature selection offered less advantage.
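The comparison described in the summary (accuracy with and without univariate t-test filtering, at several training sizes) can be illustrated with a minimal sketch. This is not the thesis's pipeline. The synthetic Gaussian data, the nearest-centroid classifier, the Welch form of the t statistic, and all function names and parameter values below are assumptions chosen so the example runs with the Python standard library alone. Features are ranked on the training data only, so the held-out test set does not influence the selection.

```python
import math
import random
import statistics


def make_data(n_per_class, n_features=200, n_informative=5, shift=1.0, rng=None):
    """Synthetic two-class data (assumption): only the first n_informative features differ."""
    rng = rng or random.Random(0)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                   for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(va / len(a) + vb / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se if se > 0 else 0.0


def select_by_t(X, y, k):
    """Sorted indices of the k features with the largest |t| on the given training data."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append((abs(welch_t(a, b)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te, features):
    """Fit class centroids on the chosen features; return accuracy on the test set."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X_tr, y_tr) if lab == label]
        centroids[label] = [statistics.fmean(r[j] for r in rows) for j in features]
    correct = 0
    for row, lab in zip(X_te, y_te):
        dist = {c: sum((row[j] - m) ** 2 for j, m in zip(features, mu))
                for c, mu in centroids.items()}
        correct += (min(dist, key=dist.get) == lab)
    return correct / len(y_te)


def compare(train_sizes=(10, 40, 160), k=10, seed=0):
    """Test accuracy with and without t-test filtering for several per-class training sizes."""
    rng = random.Random(seed)
    X_te, y_te = make_data(100, rng=rng)  # fixed held-out test set
    results = {}
    for n in train_sizes:
        X_tr, y_tr = make_data(n, rng=rng)
        all_feats = list(range(len(X_tr[0])))
        picked = select_by_t(X_tr, y_tr, k)  # selection uses training data only
        results[n] = {
            "all": nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te, all_feats),
            "t_test": nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te, picked),
        }
    return results


if __name__ == "__main__":
    for n, acc in compare().items():
        print(n, acc)
```

Running the script prints one accuracy pair per training size. Any pattern seen on this toy data reflects the synthetic setup and should not be read as reproducing the thesis's ADNI results.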