Summary: | Master's === National Sun Yat-sen University === Department of Electrical Engineering === 103 === This thesis is divided into two parts. The first part is a machine learning-based feature extraction method for regression problems. The second part is an incremental learning method, IC-ELM, for the equality constrained-optimization-based extreme learning machine (C-ELM).
One of the issues encountered in classification and regression is the inefficiency caused by a large number of dimensions, or features, in the input space. Many approaches have been proposed to handle this issue by reducing the number of dimensions associated with the underlying data set, and statistical methods have prevailed in this area. However, dimensionality reduction has received less attention for regression than for classification. Moreover, most existing methods involve computations with covariance matrices, resulting in an inefficient reduction process. In this thesis, we propose a machine learning-based dimensionality reduction approach for regression problems. Given a set of historical data, the predictor vectors involved are grouped into a number of clusters such that the instances in the same cluster are similar to one another. The user need not specify the number of clusters in advance; the clusters are created incrementally, and their number is determined automatically. Finally, one feature is extracted from each cluster by a weighted combination of the instances contained in the cluster, so the dimensionality of the original data set is reduced. Since all the original features contribute to the making of the extracted features, the characteristics of the original data set are substantially retained. Also, computation with covariance matrices is avoided, so efficiency is maintained. Experimental results on real-world data sets validate the effectiveness of the proposed approach.
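The abstract leaves the clustering and extraction details unspecified; the following is a minimal Python sketch of one plausible realization, assuming a distance threshold governs when a new cluster is created, that a cluster's prototype is the (uniformly weighted) mean of its instances, and that each extracted feature is a Gaussian similarity to a prototype. The names incremental_cluster and extract_features and the threshold and gamma parameters are illustrative, not from the thesis.

```python
import numpy as np

def incremental_cluster(X, threshold=14.0):
    """Incremental clustering: an instance joins the nearest existing
    cluster if it lies within `threshold` of that cluster's center;
    otherwise it seeds a new cluster. The number of clusters is thus
    determined automatically, without being specified in advance."""
    centers, members = [], []
    for x in X:
        if centers:
            d = [np.linalg.norm(x - c) for c in centers]
            j = int(np.argmin(d))
            if d[j] <= threshold:
                members[j].append(x)
                centers[j] = np.mean(members[j], axis=0)  # refresh prototype
                continue
        centers.append(x.copy())       # seed a new cluster
        members.append([x.copy()])
    return centers, members

def extract_features(x, centers, gamma=0.01):
    """One extracted feature per cluster: a Gaussian similarity between
    the input and the cluster prototype (here, the mean of the cluster's
    instances as a simple weighted combination). Output dimension equals
    the number of clusters, so every original feature contributes."""
    return np.array([np.exp(-gamma * np.linalg.norm(x - c) ** 2)
                     for c in centers])

# Usage on synthetic data: reduce 100 dimensions to len(centers) features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))
centers, _ = incremental_cluster(X, threshold=14.0)
Z = np.array([extract_features(x, centers) for x in X])
print(X.shape, "->", Z.shape)
```

Because the reduction uses only distances to prototypes, no covariance matrix is ever formed, which is consistent with the efficiency claim above.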
The equality constrained-optimization-based extreme learning machine, abbreviated C-ELM here, was proposed by Huang et al. Its input weights and the biases of the hidden-layer neurons are randomly assigned, and only the output weights are determined analytically. When C-ELM is used as a prediction model, the number of hidden-layer neurons (hidden nodes) must be decided in advance, as with conventional neural networks. When the model performs poorly, that number must be tuned by trial and error, which is inefficient. We therefore propose an incremental learning version of C-ELM, called IC-ELM. IC-ELM can add hidden nodes one by one or group by group, and the output weights are updated automatically whenever the number of hidden nodes changes; unlike C-ELM, it does not recompute the output weights from scratch. The adding procedure stops when a pre-defined threshold is satisfied. Experimental results show that IC-ELM is faster than C-ELM while achieving similar performance.
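As a concrete reference point, here is a minimal Python sketch of C-ELM training with the regularized least-squares solution beta = (I/C + HᵀH)⁻¹ Hᵀ T from Huang et al., together with one possible way to realize the IC-ELM idea: when k hidden nodes are appended, the stored inverse is updated with a Schur-complement block identity instead of being recomputed in full. The sigmoid activation, the stopping threshold on training RMSE, and all function names are assumptions; the thesis's exact update rule may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def celm_train(X, T, L, C=1.0, rng=None):
    """C-ELM: input weights W and biases b are random; only the output
    weights are solved analytically: beta = (I/C + H^T H)^{-1} H^T T."""
    rng = rng or np.random.default_rng(0)
    W = rng.uniform(-1, 1, size=(X.shape[1], L))
    b = rng.uniform(-1, 1, size=L)
    H = sigmoid(X @ W + b)                      # N x L hidden-layer output
    A_inv = np.linalg.inv(np.eye(L) / C + H.T @ H)
    beta = A_inv @ H.T @ T
    return W, b, H, A_inv, beta

def add_hidden_nodes(X, T, W, b, H, A_inv, k, C=1.0, rng=None):
    """Append k hidden nodes and update the output weights without
    re-inverting from scratch. With A_new = [[A, B], [B^T, D]], where
    B = H^T H_k and D = I/C + H_k^T H_k, only the k x k Schur complement
    S = D - B^T A^{-1} B needs a fresh inversion (an assumed realization)."""
    rng = rng or np.random.default_rng(1)
    W_k = rng.uniform(-1, 1, size=(X.shape[1], k))
    b_k = rng.uniform(-1, 1, size=k)
    H_k = sigmoid(X @ W_k + b_k)
    B = H.T @ H_k
    D = np.eye(k) / C + H_k.T @ H_k
    S_inv = np.linalg.inv(D - B.T @ A_inv @ B)   # only a k x k inverse
    AB = A_inv @ B
    top = np.hstack([A_inv + AB @ S_inv @ AB.T, -AB @ S_inv])
    bottom = np.hstack([-S_inv @ AB.T, S_inv])
    A_inv_new = np.vstack([top, bottom])
    H_new = np.hstack([H, H_k])
    beta_new = A_inv_new @ H_new.T @ T
    return (np.hstack([W, W_k]), np.hstack([b, b_k]),
            H_new, A_inv_new, beta_new)

# Usage: grow the network until training RMSE meets a pre-defined threshold.
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(300, 5))
T = np.sin(X.sum(axis=1, keepdims=True))
W, b, H, A_inv, beta = celm_train(X, T, L=5)
while np.sqrt(np.mean((H @ beta - T) ** 2)) > 0.01 and H.shape[1] < 200:
    W, b, H, A_inv, beta = add_hidden_nodes(X, T, W, b, H, A_inv, k=5)
print("hidden nodes:", H.shape[1])
```

The saving in this sketch is that each growth step inverts only a k x k matrix rather than the full (L+k) x (L+k) matrix that a C-ELM retraining would require.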
|