Active Learning in Black-Box Settings

Bibliographic Details
Main Authors: Neil Rubens, Vera Sheinman, Ryota Tomioka, Masashi Sugiyama
Format: Article
Language: English
Published: Austrian Statistical Society 2016-02-01
Series: Austrian Journal of Statistics
Online Access: http://www.ajs.or.at/index.php/ajs/article/view/204
Description
Summary: Active learning refers to settings in which a machine learning algorithm (the learner) is able to select the data from which it learns (choosing points and then obtaining their labels), and by doing so aims to achieve better accuracy (e.g., by avoiding training data that is redundant or unimportant). Active learning is particularly useful when the labeling cost is high. A common assumption is that the active learning algorithm knows the details of the underlying learning algorithm for which it obtains the data. However, in many practical settings, obtaining precise details of the learning algorithm may not be feasible, making the underlying algorithm in essence a black box – no knowledge of its internal workings is available, and only the inputs and the corresponding output estimates are accessible. This renders many of the traditional approaches inapplicable, or at least ineffective. Hence our motivation is to use the only data that is accessible in black-box settings – the output estimates. We note that accuracy will improve only if the learner’s output estimates change. Therefore, we propose an active learning criterion that utilizes the information contained within the changes of output estimates.
ISSN: 1026-597X
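
The criterion described in the summary (preferring the point whose acquisition is expected to change the learner's output estimates the most) can be illustrated with a minimal sketch. The learner used below (a k-nearest-neighbour regressor from scikit-learn), the surrogate-label trick, and the helper names output_change_query and knn_fit_predict are illustrative assumptions, not the authors' implementation: the learner is treated purely as a black box that accepts training data and returns output estimates over a pool.

import numpy as np
from sklearn.neighbors import KNeighborsRegressor  # stand-in black-box learner (assumption)

def output_change_query(fit_predict, X_labeled, y_labeled, X_pool):
    """Pick the pool index whose hypothetical labeling changes output estimates most.

    fit_predict(X, y, X_eval) must return the black-box learner's output
    estimates on X_eval after being trained on (X, y); nothing else about
    the learner is used.
    """
    base = fit_predict(X_labeled, y_labeled, X_pool)
    best_idx, best_change = 0, -np.inf
    for i in range(len(X_pool)):
        # The true label is unknown before querying, so the current output
        # estimate is used as a surrogate label (a simplifying assumption).
        X_aug = np.vstack([X_labeled, X_pool[i]])
        y_aug = np.append(y_labeled, base[i])
        new = fit_predict(X_aug, y_aug, X_pool)
        change = np.mean((new - base) ** 2)  # magnitude of the change in estimates
        if change > best_change:
            best_idx, best_change = i, change
    return best_idx

# Example usage with a k-NN regressor playing the role of the black box.
def knn_fit_predict(X, y, X_eval):
    return KNeighborsRegressor(n_neighbors=3).fit(X, y).predict(X_eval)

rng = np.random.default_rng(0)
X_labeled = rng.uniform(-1, 1, size=(5, 1))
y_labeled = np.sin(3 * X_labeled[:, 0])
X_pool = rng.uniform(-1, 1, size=(50, 1))
print("query next:", output_change_query(knn_fit_predict, X_labeled, y_labeled, X_pool))

In this sketch the only information flowing from the learner to the selection rule is its output estimates before and after a hypothetical retraining, which mirrors the black-box constraint stated in the summary.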