Active Search with Complex Actions and Rewards

Active search studies algorithms that aim to find all positive examples in an unknown environment by collecting and learning from labels that are costly to obtain. These algorithms start with a pool of unlabeled data, act to design queries, and are rewarded by the number of positive examples found over a long-term horizon.

Full description

Bibliographic Details
Main Author: Ma, Yifei
Format: Others
Published: Research Showcase @ CMU 2017
Online Access: http://repository.cmu.edu/dissertations/1115
http://repository.cmu.edu/cgi/viewcontent.cgi?article=2154&context=dissertations
id ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-2154
record_format oai_dc
spelling ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-2154 2018-01-26T03:22:01Z Active Search with Complex Actions and Rewards Ma, Yifei Active search studies algorithms that aim to find all positive examples in an unknown environment by collecting and learning from labels that are costly to obtain. These algorithms start with a pool of unlabeled data, act to design queries, and are rewarded by the number of positive examples found over a long-term horizon. Active search is connected to active learning, multi-armed bandits, and Bayesian optimization. To date, most active search methods are limited by the assumption that query actions and rewards are based on single data points in a low-dimensional Euclidean space. Many applications, however, define actions and rewards in more complex ways. For example, active search may be used to recommend items that are connected by a network graph, where the edges indicate item (node) similarity. In environmental monitoring, the active search reward is defined over regions, because pollution is only identified by finding an entire region with consistently large measurement outcomes. Similarly, to search efficiently for sparse signal hotspots in a large area, aerial robots may query at high altitudes, observing the average value over an entire region. Finally, active search usually ignores the computational complexity of designing actions, which becomes infeasible in large problems. We develop methods to address these disparate issues. In a graph environment, the exploratory queries that reveal the most information about the user models differ from those in Euclidean space. We use a new exploration criterion called Σ-optimality, which is motivated by a different objective, active surveying, yet performs better empirically because it tends to query cluster centers.
We also show submodularity-based guarantees that justify the greedy application of various heuristics, including Σ-optimality, and we perform a regret analysis for active search with results comparable to the existing literature. For active area search with region rewards, we design an algorithm called APPS, which optimizes the one-step look-ahead expected reward for finding positive regions with high probability. APPS is initially solved by Monte Carlo estimates, but for simple objectives, e.g., finding regions with large average pollution concentration, it has a closed-form solution called AAS that connects to Bayesian quadrature. For active needle search with region queries using aerial robots, we pick queries that maximize the information gain about possible signal hotspot locations. Our method, called RSI, reduces to bisection search when the measurements are noiseless and the signal hotspot is unique. Turning to noisy measurements, we show that RSI requires a near-optimal expected number of measurements, comparable to results from compressive sensing (CS). On the other hand, CS relies on weighted averages, which are harder to realize than the plain averages we use. Finally, to address the scalability challenge, we borrow ideas from Thompson sampling, which approximates near-optimal decisions by drawing from the model uncertainty and acting greedily on the draw. Our method, conjugate sampling, delivers real computational benefits when the uncertainty is modeled with sparse or circulant matrices. 2017-05-01T07:00:00Z text application/pdf http://repository.cmu.edu/dissertations/1115 http://repository.cmu.edu/cgi/viewcontent.cgi?article=2154&context=dissertations Dissertations Research Showcase @ CMU
collection NDLTD
format Others
sources NDLTD
description Active search studies algorithms that aim to find all positive examples in an unknown environment by collecting and learning from labels that are costly to obtain. These algorithms start with a pool of unlabeled data, act to design queries, and are rewarded by the number of positive examples found over a long-term horizon. Active search is connected to active learning, multi-armed bandits, and Bayesian optimization. To date, most active search methods are limited by the assumption that query actions and rewards are based on single data points in a low-dimensional Euclidean space. Many applications, however, define actions and rewards in more complex ways. For example, active search may be used to recommend items that are connected by a network graph, where the edges indicate item (node) similarity. In environmental monitoring, the active search reward is defined over regions, because pollution is only identified by finding an entire region with consistently large measurement outcomes. Similarly, to search efficiently for sparse signal hotspots in a large area, aerial robots may query at high altitudes, observing the average value over an entire region. Finally, active search usually ignores the computational complexity of designing actions, which becomes infeasible in large problems. We develop methods to address these disparate issues. In a graph environment, the exploratory queries that reveal the most information about the user models differ from those in Euclidean space. We use a new exploration criterion called Σ-optimality, which is motivated by a different objective, active surveying, yet performs better empirically because it tends to query cluster centers. We also show submodularity-based guarantees that justify the greedy application of various heuristics, including Σ-optimality, and we perform a regret analysis for active search with results comparable to the existing literature.
For active area search with region rewards, we design an algorithm called APPS, which optimizes the one-step look-ahead expected reward for finding positive regions with high probability. APPS is initially solved by Monte Carlo estimates, but for simple objectives, e.g., finding regions with large average pollution concentration, it has a closed-form solution called AAS that connects to Bayesian quadrature. For active needle search with region queries using aerial robots, we pick queries that maximize the information gain about possible signal hotspot locations. Our method, called RSI, reduces to bisection search when the measurements are noiseless and the signal hotspot is unique. Turning to noisy measurements, we show that RSI requires a near-optimal expected number of measurements, comparable to results from compressive sensing (CS). On the other hand, CS relies on weighted averages, which are harder to realize than the plain averages we use. Finally, to address the scalability challenge, we borrow ideas from Thompson sampling, which approximates near-optimal decisions by drawing from the model uncertainty and acting greedily on the draw. Our method, conjugate sampling, delivers real computational benefits when the uncertainty is modeled with sparse or circulant matrices.
author Ma, Yifei
spellingShingle Ma, Yifei
Active Search with Complex Actions and Rewards
author_facet Ma, Yifei
author_sort Ma, Yifei
title Active Search with Complex Actions and Rewards
title_short Active Search with Complex Actions and Rewards
title_full Active Search with Complex Actions and Rewards
title_fullStr Active Search with Complex Actions and Rewards
title_full_unstemmed Active Search with Complex Actions and Rewards
title_sort active search with complex actions and rewards
publisher Research Showcase @ CMU
publishDate 2017
url http://repository.cmu.edu/dissertations/1115
http://repository.cmu.edu/cgi/viewcontent.cgi?article=2154&context=dissertations
work_keys_str_mv AT mayifei activesearchwithcomplexactionsandrewards
_version_ 1718611852875792384