Non-asymptotic bounds for prediction problems and density estimation.
This dissertation investigates learning scenarios where a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active...
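Main Author: | Minsker, Stanislav |
---|---|
Published: | Georgia Institute of Technology, 2012 |
Subjects: | Active learning; Sparse recovery; Oracle inequality; Confidence bands; Infinite dictionary; Estimation theory Asymptotic theory; Estimation theory; Distribution (Probability theory); Prediction theory; Algorithms; Mathematical optimization; Chebyshev approximation |
Online Access: | http://hdl.handle.net/1853/44808 |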
id |
ndltd-GATECH-oai-smartech.gatech.edu-1853-44808 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-GATECH-oai-smartech.gatech.edu-1853-44808 2013-01-17T09:07:41Z Non-asymptotic bounds for prediction problems and density estimation. Minsker, Stanislav. Active learning; Sparse recovery; Oracle inequality; Confidence bands; Infinite dictionary; Estimation theory Asymptotic theory; Estimation theory; Distribution (Probability theory); Prediction theory; Active learning; Algorithms; Mathematical optimization; Chebyshev approximation. This dissertation investigates learning scenarios where a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning. Given a random couple (X,Y) with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. A prediction rule is constructed from a sequence of observations sampled from P. The concept of active learning can be informally characterized as follows: at every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide minimax lower bounds for the performance of active learning methods. Second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem. The second part of this thesis is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P. Given a collection of functions H, the goal of dictionary learning is to construct a prediction rule for Y given by a linear combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H.
We propose an estimator of the unknown optimal prediction rule based on a penalized empirical risk minimization algorithm. We show that the proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance. Georgia Institute of Technology 2012-09-20T18:20:29Z 2012-09-20T18:20:29Z 2012-07-05 Dissertation http://hdl.handle.net/1853/44808 |
collection |
NDLTD |
sources |
NDLTD |
topic |
Active learning; Sparse recovery; Oracle inequality; Confidence bands; Infinite dictionary; Estimation theory Asymptotic theory; Estimation theory; Distribution (Probability theory); Prediction theory; Active learning; Algorithms; Mathematical optimization; Chebyshev approximation |
spellingShingle |
Active learning; Sparse recovery; Oracle inequality; Confidence bands; Infinite dictionary; Estimation theory Asymptotic theory; Estimation theory; Distribution (Probability theory); Prediction theory; Active learning; Algorithms; Mathematical optimization; Chebyshev approximation; Minsker, Stanislav; Non-asymptotic bounds for prediction problems and density estimation. |
description |
This dissertation investigates learning scenarios where a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning.
Given a random couple (X,Y) with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. A prediction rule is constructed from a sequence of observations sampled from P. The concept of active learning can be informally characterized as follows: at every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide minimax lower bounds for the performance of active learning methods. Second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem.
The second part of this thesis is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P. Given a collection of functions H, the goal of dictionary learning is to construct a prediction rule for Y given by a linear combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H. We propose an estimator of the unknown optimal prediction rule based on a penalized empirical risk minimization algorithm. We show that the proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance. |
author |
Minsker, Stanislav |
title |
Non-asymptotic bounds for prediction problems and density estimation. |
publisher |
Georgia Institute of Technology |
publishDate |
2012 |
url |
http://hdl.handle.net/1853/44808 |
work_keys_str_mv |
AT minskerstanislav nonasymptoticboundsforpredictionproblemsanddensityestimation |
_version_ |
1716575697480187904 |