Summary: | The EM algorithm is one of the most popular statistical learning algorithms. Unfortunately, it is a batch learning method. For large data sets and real-time systems, we need to develop on-line methods. In this thesis, we present a comprehensive study of on-line EM algorithms. We use Bayesian theory to propose a new on-line EM algorithm for multinomial mixtures. Based on this theory, we show that there is a direct connection between the setting of Bayes priors and the so-called learning rates of stochastic approximation algorithms, such as on-line EM and quasi-Bayes. Finally, we present extensive simulations, comparisons and parameter sensitivity studies on both synthetic data and documents with text, images and music. === Science, Faculty of === Computer Science, Department of === Graduate
|