Summary: | In recent years, probabilistic models have become fundamental techniques in machine learning. They are successfully applied in various engineering problems, such as robotics, biometrics, brain-computer interfaces or artificial vision, and will gain in importance in the near future. This work deals with the difficult, but common situation where the data is, either very noisy, or scarce compared to the complexity of the process to model. We focus on latent variable models, which can be formalized as probabilistic graphical models and learned by the expectation-maximization algorithm or its variants (e.g., variational Bayes).<br>
After having carefully studied a non-exhaustive list of multivariate kernel density estimators, we established that in most applications locally adaptive estimators should be preferred. Unfortunately, these methods are usually sensitive to outliers and have often too many parameters to set. Therefore, we focus on finite mixture models, which do not suffer from these drawbacks provided some structural modifications.<br>
Two questions are central in this dissertation: (i) how to make mixture models robust to noise, i.e. deal efficiently with outliers, and (ii) how to exploit side-channel information, i.e. additional information intrinsic to the data. In order to tackle the first question, we extent the training algorithms of the popular Gaussian mixture models to the Student-t mixture models. the Student-t distribution can be viewed as a heavy-tailed alternative to the Gaussian distribution, the robustness being tuned by an extra parameter, the degrees of freedom. Furthermore, we introduce a new variational Bayesian algorithm for learning Bayesian Student-t mixture models. This algorithm leads to very robust density estimators and clustering. To address the second question, we introduce manifold constrained mixture models. This new technique exploits the information that the data is living on a manifold of lower dimension than the dimension of the feature space. Taking the implicit geometrical data arrangement into account results in better generalization on unseen data.<br>
Finally, we show that the latent variable framework used for learning mixture models can be extended to construct probabilistic regularization networks, such as the Relevance Vector Machines. Subsequently, we make use of these methods in the context of an optic nerve visual prosthesis to restore partial vision to blind people of whom the optic nerve is still functional. Although visual sensations can be induced electrically in the blind's visual field, the coding scheme of the visual information along the visual pathways is poorly known. Therefore, we use probabilistic models to link the stimulation parameters to the features of the visual perceptions. Both black-box and grey-box models are considered. The grey-box models take advantage of the known neurophysiological information and are more instructive to medical doctors and psychologists.<br>
|