A general framework for reducing variance in agent evaluation
Main Author: White, Martha
Other Authors: Bowling, Michael (Computing Science); Schuurmans, Dale (Computing Science)
Thesis Committee: Szafron, Duane (Computing Science); Hooper, Peter (Mathematical and Statistical Sciences)
Format: Thesis (Master of Science), Department of Computing Science, University of Alberta
Language: English
Published: 2010
Subjects: variance reduction; machine learning; agent evaluation
Online Access: http://hdl.handle.net/10048/890
Related Publication: Martha White and Michael Bowling. Learning a value analysis tool for agent evaluation. In Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence, pages 1976-1981, 2009.
Description:
In this work, we present a unified, general approach to variance reduction in agent evaluation using machine learning. Evaluating an agent's performance in a stochastic setting is necessary for agent development, scientific evaluation, and competitions. Traditionally, evaluation is done using Monte Carlo estimation (sample averages); however, the magnitude of the stochasticity in the domain or the high cost of sampling can often prevent the approach from yielding statistically significant conclusions. Recently, an advantage sum technique based on control variates has been proposed for constructing unbiased, low-variance estimates of agent performance. The technique requires an expert to define a value function over states of the system, essentially a guess of each state's unknown value. In this work, we propose learning this value function from past interactions between agents in some target population. Our learned value functions have two key advantages: they can be applied in domains where no expert value function is available, and they can result in evaluation tuned to a specific population of agents (e.g., novice versus advanced agents). This work has three main contributions. First, we consolidate previous work on using control variates for variance reduction into one unified, general framework and summarize the connections among these previous approaches. Second, our framework makes variance reduction practically possible in any sequential decision-making task where designing the expert value function is time-consuming, difficult, or essentially impossible. We prove the optimality of our approach and extend the theoretical understanding of advantage sum estimators. In addition, we significantly extend the applicability of advantage sum estimators and discuss practical methods for using our framework in real-world scenarios. Finally, we provide low-variance estimators for three poker domains previously without variance reduction and improve strategy selection in the expert-level University of Alberta poker bot.
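
The control-variate idea in the abstract can be made concrete with a small numeric sketch. The Python example below is illustrative only, not code from the thesis: it evaluates an agent in a toy game whose payoff is dominated by a die roll, and compares a plain Monte Carlo estimate against a control-variate (advantage-sum style) estimate that subtracts the zero-mean luck term V(roll) - E[V]. The game, the skill parameter, and the value function V are hypothetical stand-ins for the thesis's poker domains and learned value functions.

```python
# Illustrative sketch only -- not code from the thesis. All names
# (play_game, skill, V) are hypothetical stand-ins.
import random
import statistics

random.seed(0)

def play_game(skill=0.1):
    """One game: the payoff is mostly luck (a die roll) plus a small,
    skill-dependent shift that we actually want to measure."""
    roll = random.randint(1, 6)               # chance event, outside the agent's control
    payoff = roll + random.gauss(skill, 0.2)  # skill contributes a small offset
    return roll, payoff

# Value function over the chance outcome: the expected payoff given the
# roll for a baseline (zero-skill) agent. The thesis's point is that such
# a function can be learned from logged games rather than hand-specified.
def V(roll):
    return float(roll)

E_V = sum(V(r) for r in range(1, 7)) / 6.0    # known expectation of V = 3.5

games = [play_game() for _ in range(1000)]

# Plain Monte Carlo samples: unbiased but noisy, since the die roll
# dominates the per-game variance.
mc_samples = [p for _, p in games]

# Control-variate samples: subtract the zero-mean luck term V(roll) - E[V].
# Still unbiased, since E[V(roll) - E_V] = 0, but far lower variance.
cv_samples = [p - (V(r) - E_V) for r, p in games]

print(f"Monte Carlo:      mean={statistics.mean(mc_samples):.3f}  "
      f"sd={statistics.stdev(mc_samples):.3f}")
print(f"Control variate:  mean={statistics.mean(cv_samples):.3f}  "
      f"sd={statistics.stdev(cv_samples):.3f}")
# Both means estimate the true skilled payoff (~3.6); the per-sample
# standard deviation drops from roughly 1.7 to roughly 0.2.
```

Because the subtracted term has expectation zero, the estimator stays unbiased while shedding most of the chance-driven variance. In the thesis, V is not hand-built as in this sketch but learned from past interactions within the target agent population, which is what makes the approach applicable when no expert value function exists.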