Summary: Inductive learning of statistical models from relational data is a key problem in artificial intelligence. Two main approaches exist for learning with relational data, and this thesis shows how they can be combined in a unified framework.
The first approach aims to learn dependencies amongst features (relations and properties), e.g. how users' purchases of products depend on users' preferences for those products and on associated properties of users and products. Such models abstract over individuals, and are compact and easy to interpret.
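For concreteness, the short Python sketch below shows the kind of feature-level dependency such a model encodes: the probability that a user buys a product as a function of whether the user likes it and of a product property, with the same parameters shared across every user-product pair. The predicate names and parameter values are illustrative assumptions, not taken from the thesis.

    # Illustrative feature-level dependency: P(buys | likes, expensive).
    # The same small table of parameters applies to every (user, product)
    # pair, which is what makes such models abstract over individuals
    # and stay compact.
    P_BUYS = {
        # (likes, expensive) -> P(buys = True)
        (True,  False): 0.80,
        (True,  True):  0.45,
        (False, False): 0.10,
        (False, True):  0.02,
    }

    def prob_buys(likes: bool, expensive: bool) -> float:
        """Return the shared conditional probability of a purchase."""
        return P_BUYS[(likes, expensive)]

    print(prob_buys(likes=True, expensive=False))   # 0.8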
The second approach learns latent properties of individuals that explain the observed features, without modelling interdependencies amongst features. Latent-property models have demonstrated good predictive accuracy in practice, and are especially useful when few properties and relations are observed. They can also discover interesting latent groupings of individuals.
Our approach aims to learn a unified representation of dependency structures over both observed features and latent properties. We develop a simple approximate EM algorithm for learning the unified representation, and experiments demonstrate cases where our algorithm generates models that predict better than both dependency-based models of observed features and a state-of-the-art latent-property model.
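As a hedged illustration of the flavour of such an EM procedure, and not the thesis's actual algorithm or notation, the Python sketch below fits a model in which each individual has a latent property with K values and the observed binary features are conditionally independent given that value. All names here (em_latent_property, pi, theta, K) are assumptions made for illustration.

    import numpy as np

    def em_latent_property(X, K=3, n_iters=50, seed=0):
        """X: (N, M) binary matrix of observed features per individual."""
        rng = np.random.default_rng(seed)
        N, M = X.shape
        pi = np.full(K, 1.0 / K)                       # weights over latent values
        theta = rng.uniform(0.25, 0.75, size=(K, M))   # P(feature_j = 1 | latent value k)

        for _ in range(n_iters):
            # E-step: soft assignment of each individual to latent values.
            log_r = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
            log_r -= log_r.max(axis=1, keepdims=True)
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)

            # M-step: re-estimate parameters from the soft assignments.
            Nk = r.sum(axis=0)
            pi = Nk / N
            theta = (r.T @ X + 1.0) / (Nk[:, None] + 2.0)   # Laplace smoothing

        return pi, theta, r

    # Example: 100 individuals, 8 observed binary features.
    X = (np.random.default_rng(1).random((100, 8)) < 0.4).astype(float)
    pi, theta, resp = em_latent_property(X, K=3)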
We extend our approximate EM algorithm to handle uncertainty about the number of values for latent properties. We search over the number of values and return error bounds, as an alternative to existing proposals based on sampling from the posterior distribution over the number of values.
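The following sketch is illustrative only: the thesis's error-bound computation is not reproduced here, so candidate numbers of latent values are simply scored by held-out log-likelihood, reusing the em_latent_property function from the previous sketch.

    import numpy as np

    def heldout_loglik(X, pi, theta):
        """Average per-individual log-likelihood under the fitted mixture."""
        log_p = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
        m = log_p.max(axis=1, keepdims=True)
        return float(np.mean(m.squeeze() + np.log(np.exp(log_p - m).sum(axis=1))))

    def search_num_values(X_train, X_heldout, candidate_Ks=(2, 3, 4, 5, 8)):
        """Fit a model for each candidate K and keep the best held-out score."""
        scores = {}
        for K in candidate_Ks:
            pi, theta, _ = em_latent_property(X_train, K=K)
            scores[K] = heldout_loglik(X_heldout, pi, theta)
        best_K = max(scores, key=scores.get)
        return best_K, scores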
We also address the specific case where dependencies involve functional relations, which induces a verbose model with many parameters. By contrast, the standard solution of aggregating over all values of the function yields a simple model that predicts poorly. We show how to learn an optimal intermediate-size representation efficiently by clustering the values of the function. The proposed method generates models that capture interesting clusters of function values, dominates the simple model in prediction, and can surpass the verbose model with far fewer parameters.
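As an illustrative sketch only, not the thesis's clustering algorithm, the code below groups the values of a functional relation by the empirical rate of a binary target given each value, and uses the cluster identity as the intermediate-size feature. The choice of one-dimensional k-means and of the target rate as the clustering statistic are assumptions made here.

    from collections import defaultdict
    import numpy as np

    def cluster_function_values(func_values, targets, n_clusters=4, n_iters=100, seed=0):
        """func_values: list of hashable values f(x); targets: parallel 0/1 labels."""
        # Per-value target rate with Laplace smoothing.
        counts = defaultdict(lambda: [0.0, 0.0])        # value -> [positives, total]
        for v, y in zip(func_values, targets):
            counts[v][0] += y
            counts[v][1] += 1
        values = sorted(counts)
        rates = np.array([(counts[v][0] + 1.0) / (counts[v][1] + 2.0) for v in values])

        # One-dimensional k-means on the per-value rates.
        rng = np.random.default_rng(seed)
        centers = rng.choice(rates, size=min(n_clusters, len(rates)), replace=False)
        for _ in range(n_iters):
            assign = np.argmin(np.abs(rates[:, None] - centers[None, :]), axis=1)
            for k in range(len(centers)):
                if np.any(assign == k):
                    centers[k] = rates[assign == k].mean()
        assign = np.argmin(np.abs(rates[:, None] - centers[None, :]), axis=1)
        return {v: int(k) for v, k in zip(values, assign)}

    # Example: cluster the values of a function such as country(user) by how
    # often users from each country buy a given product.
    countries = ["ca", "us", "fr", "de", "jp", "ca", "us", "fr", "de", "jp"]
    bought    = [1,    1,    0,    0,    1,    1,    0,    0,    0,    1]
    print(cluster_function_values(countries, bought, n_clusters=2))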