Non-asymptotic bounds for prediction problems and density estimation.

This dissertation investigates learning scenarios in which a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active...

Bibliographic Details
Main Author: Minsker, Stanislav
Published: Georgia Institute of Technology 2012
Subjects:
Online Access: http://hdl.handle.net/1853/44808
id ndltd-GATECH-oai-smartech.gatech.edu-1853-44808
record_format oai_dc
spelling ndltd-GATECH-oai-smartech.gatech.edu-1853-44808 2013-01-17T09:07:41Z Non-asymptotic bounds for prediction problems and density estimation. Minsker, Stanislav. Georgia Institute of Technology. 2012-09-20T18:20:29Z 2012-09-20T18:20:29Z 2012-07-05 Dissertation http://hdl.handle.net/1853/44808
collection NDLTD
sources NDLTD
topic Active learning
Sparse recovery
Oracle inequality
Confidence bands
Infinite dictionary
Estimation theory -- Asymptotic theory
Estimation theory
Distribution (Probability theory)
Prediction theory
Algorithms
Mathematical optimization
Chebyshev approximation
description This dissertation investigates learning scenarios in which a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning. Given a random couple (X,Y) with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. A prediction rule is constructed from a sequence of observations sampled from P. The concept of active learning can be informally characterized as follows: on every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide minimax lower bounds for the performance of active learning methods; second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem. The second part of this thesis is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P. Given a collection of functions H, the goal of dictionary learning is to construct a prediction rule for Y given by a linear combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H. We propose an estimator of the unknown optimal prediction rule based on a penalized empirical risk minimization algorithm. We show that the proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance.
author Minsker, Stanislav
title Non-asymptotic bounds for prediction problems and density estimation.
publisher Georgia Institute of Technology
publishDate 2012
url http://hdl.handle.net/1853/44808