Summary: | Doctoral === National Tsing Hua University === Department of Industrial Engineering and Engineering Management === 97 === In recent years, data mining has attracted a great deal of attention in the information industry because of the wide availability of huge amounts of data and the pressing need to turn such data into useful information and knowledge. The information and knowledge gained can be applied to business management, production control, engineering design, and other areas. The Mahalanobis-Taguchi System (MTS), developed by Dr. Taguchi, is a relatively new data mining tool. MTS is a collection of methods for diagnosis, forecasting, binary classification, and feature selection using multivariate data, and it has been successfully used in various applications.
This study explores and extends the theory of MTS and seeks to overcome its existing limitations and drawbacks, in both theory and practice, in order to strengthen the reliability and practicality of MTS. Several real cases are then solved to demonstrate the benefits of the proposed methods. The contents of this study are as follows:
On the theoretical side, this study investigates the reliability and robustness of MTS in dealing with the "class imbalance problem". In a class imbalance problem, one class is represented by a large number of examples, while the other class, usually the more important one, is represented by only a few. Class imbalance tends to degrade the performance of classification algorithms and to bias classification: the classifier achieves high predictive accuracy on the majority class but predicts poorly on the minority class, which may lead to a great loss for the whole system. In addition, to resolve the open practical issue of determining the classification threshold for MTS, we develop a "probabilistic thresholding method", based on Chebyshev's theorem, that derives an appropriate threshold for binary classification. Furthermore, because multi-class problems occur frequently in real applications, a novel multi-class classification and feature selection method, the multi-class Mahalanobis-Taguchi System (MMTS), is developed on the basis of MTS theory. By establishing an individual Mahalanobis space for each class and applying the proposed "weighted Mahalanobis distance" as the distance metric for classification, MMTS extends MTS to multi-class applications. Several datasets are used in numerical experiments and comparisons to validate these claims and the proposed methodologies.
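The abstract does not give the formula of the probabilistic thresholding method, so the following is only a minimal sketch of how a Chebyshev-based threshold could be attached to the standard scaled Mahalanobis distance of MTS. It assumes the threshold is placed at mu + sigma / sqrt(alpha), where mu and sigma are the mean and standard deviation of the normal group's distances; by Chebyshev's inequality, at most a fraction alpha of any finite-variance distribution lies that far from its mean. The function names, the parameter `alpha`, and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from statistics import mean, pstdev


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mahalanobis_space(normal):
    """Means, standard deviations and correlation matrix of the normal group."""
    cols = list(zip(*normal))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    z = [[(x - m) / s for x, m, s in zip(row, mu, sd)] for row in normal]
    k, n = len(mu), len(normal)
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(k)]
            for i in range(k)]
    return mu, sd, corr


def scaled_md(x, space):
    """Scaled Mahalanobis distance MD = z' R^-1 z / k used in MTS."""
    mu, sd, corr = space
    z = [(v - m) / s for v, m, s in zip(x, mu, sd)]
    w = solve(corr, z)  # w = R^-1 z without forming the inverse explicitly
    return sum(a * b for a, b in zip(z, w)) / len(z)


def chebyshev_threshold(distances, alpha=0.05):
    """ASSUMED rule, not the thesis's formula: mu + sigma / sqrt(alpha).

    Chebyshev: P(|D - mu| >= k * sigma) <= 1 / k**2, so k = 1 / sqrt(alpha)
    bounds the tail probability by alpha (approximately, because mu and
    sigma are estimated from the sample).
    """
    k = 1.0 / math.sqrt(alpha)
    return mean(distances) + k * pstdev(distances)


# Hypothetical two-feature normal group, for illustration only.
normal = [(1.0, 2.0), (2.0, 2.5), (3.0, 3.9),
          (4.0, 4.2), (5.0, 5.8), (6.0, 5.9)]
space = fit_mahalanobis_space(normal)
mds = [scaled_md(row, space) for row in normal]
threshold = chebyshev_threshold(mds, alpha=0.05)
is_abnormal = scaled_md((1.0, 6.0), space) > threshold
```

With population statistics, the scaled distances of the normal group average to one by construction, which is why the threshold is expressed relative to the normal group. An observation that breaks the correlation pattern of the normal group, such as the last point above, receives a much larger distance and is flagged as abnormal.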
On the application side, three real cases are solved using MTS and the proposed MMTS. The purpose of the first case is to reduce the number of radio frequency inspection attributes in the mobile phone manufacturing process. In this case there are two inspection outcomes, pass and fail, and the collected data are typically imbalanced. MTS with the proposed probabilistic thresholding method is therefore employed to detect and remove redundant inspection attributes. The results show that the number of attributes is significantly reduced without loss of inspection accuracy. The second case concerns predicting the development of type 2 diabetes mellitus from gestational diabetes mellitus. Because this is a multi-class application, we use the proposed MMTS to identify the significant risk factors for developing type 2 diabetes mellitus from gestational diabetes mellitus and to predict its occurrence. MMTS achieves good prediction accuracy and identifies the risk factors. By monitoring these risk factors, medical personnel can care effectively for women with gestational diabetes mellitus, helping to prevent type 2 diabetes mellitus and protect their health. The final case establishes an automatic multi-class timbre classification system (AMTCS) to avoid the judgment bias of human hearing and to increase the accuracy and reliability of timbre quality inspection in alto saxophone manufacturing. For this purpose, in addition to employing MMTS, a feature extraction method for one-dimensional signals such as vibration and sound, called the "waveform shape-based feature extraction method (WFEM)", is developed and used to extract the saxophone sound features. AMTCS provides strong support for the final timbre inspection of alto saxophones. The results show that AMTCS achieves 100% accuracy in saxophone timbre inspection.
|