Summary: | Statistical machine learning methods provide us with a principled framework for extracting meaningful information from noisy high-dimensional data sets. A significant feature of such procedures is that the inferences made are statistically significant, computationally efficient and scientifically meaningful. In this thesis we make several contributions to such statistical procedures. Our contributions are two-fold.
We first address prediction and estimation problems in non-standard situations. We show that even when given no access to labeled samples, one can still consistently estimate error rate of predictors and train predictors with respect to a given (convex) loss function. We next propose an efficient procedure for predicting with large output spaces, that scales logarithmically in the dimensionality of the output space. We further propose an asymptotically optimal procedure for sparse multi-task learning when the tasks share a joint support. We show consistency of the proposed method and derive rates of convergence.
We next address the problem of learning meaningful representations of data. We propose a method for learning sparse representations that takes into account the structure of the data space and demonstrate how it enables one to obtain meaningful features. We establish sample complexity results for the proposed approach. We then propose a model-free feature selection procedure and establish its sure-screening property in the high dimensional regime. Furthermore we show that with a slight modification, the approach previously proposed for sparse multi-task learning enables one to obtain sparse representations for multiple related tasks simultaneously.
|