Summary: | Master's === National Chiao Tung University === Department of Industrial Engineering and Management === 93 === Classification is one of the main tasks of data mining. To make classification efficient, feature selection is usually built into the construction of a classification model. In binary classification problems, the ratio between the numbers of examples belonging to the two classes in the training data set is an important factor in how effectively the classification model learns. If a data set contains many examples from one class and few examples from the other, it is called imbalanced data. A classification model learned from an imbalanced training data set is biased, which lowers its sensitivity in detecting the class that has few examples in the training data set. The Mahalanobis-Taguchi System (MTS) is a new diagnosis and forecasting technique for multivariate data. MTS establishes a classification model by constructing a continuous measurement scale rather than by learning from the training data set. Therefore, MTS is not influenced by the data distribution. This study compared MTS with other classification techniques and found that MTS outperforms them and is robust on imbalanced data. In addition, this study proposed a probabilistic threshold for MTS based on Chebyshev’s theorem, and this threshold yields good classification performance. Finally, MTS was employed to analyze the RF test process in mobile phone manufacturing. Data from the RF test process are typically imbalanced. Implementation results showed that the number of test attributes was significantly reduced while the RF test process maintained high inspection accuracy.
|
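The idea of a measurement scale built from one class only, with a Chebyshev-style cutoff, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the synthetic data, the value of `alpha`, and in particular the threshold rule (mean plus `1/sqrt(alpha)` standard deviations of the normal group's distances), which is only one plausible reading of "a probabilistic threshold according to Chebyshev's theorem". The scaled Mahalanobis distance follows the usual MTS convention of dividing by the number of attributes.

```python
import numpy as np

def fit_mahalanobis_space(normal):
    """Estimate mean, std, and inverse correlation matrix from the normal group only."""
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    z = (normal - mean) / std
    corr = np.corrcoef(z, rowvar=False)
    return mean, std, np.linalg.inv(corr)

def scaled_md(x, mean, std, inv_corr):
    """Scaled Mahalanobis distance: z^T R^-1 z divided by the number of attributes."""
    z = (np.atleast_2d(x) - mean) / std
    k = z.shape[1]
    return np.einsum("ij,jk,ik->i", z, inv_corr, z) / k

def chebyshev_threshold(md_normal, alpha=0.05):
    """Hypothetical cutoff: by Chebyshev's inequality, at most a fraction alpha
    of the normal group's distances can lie this far from their mean."""
    k = 1.0 / np.sqrt(alpha)
    return md_normal.mean() + k * md_normal.std(ddof=1)

# Synthetic, imbalanced example: 200 normal observations, 5 abnormal ones.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 3))
abnormal = rng.normal(loc=6.0, size=(5, 3))

alpha = 0.05
mean, std, inv_corr = fit_mahalanobis_space(normal)
md_normal = scaled_md(normal, mean, std, inv_corr)
md_abnormal = scaled_md(abnormal, mean, std, inv_corr)
threshold = chebyshev_threshold(md_normal, alpha)

# An observation is flagged as abnormal when its distance exceeds the threshold.
flagged_abnormal = md_abnormal > threshold
false_alarm_rate = (md_normal > threshold).mean()
```

Because the scale and the cutoff are estimated from the normal group alone, the number of abnormal examples does not enter the fit, which is the property the abstract credits for robustness on imbalanced data. Chebyshev's inequality makes no distributional assumption, so the cutoff is conservative.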