Summary: | Many algorithms exist for learning how to act in a repeated game, and most of them have theoretical guarantees associated with their behaviour. However, there are few experimental results on the empirical performance of these algorithms, even though such results are important for any practical application of this work. Most of the empirical claims in the literature to date have been based on small experiments, and this has hampered the development of multiagent learning (MAL) algorithms with good performance properties.
In order to rectify this problem, we have developed a suite of tools for running multiagent experiments, called the Multiagent Learning Testbed (MALT). These tools are designed to facilitate larger and more comprehensive experiments by removing the need to code one-off experimental apparatus. MALT also provides public implementations of a number of MAL algorithms, which should reduce or eliminate differences between algorithm implementations and increase the reproducibility of results. Using this test suite, we ran an experiment that is unprecedented in the number of MAL algorithms used and the number of game instances generated. We analyzed the results with a variety of performance metrics, including reward, maxmin distance, regret, and several types of convergence, and drew on a number of empirical analysis methods. Through this analysis we found some surprising results, the most striking of which was that a very simple algorithm, one intended for single-agent reinforcement learning problems rather than multiagent learning, empirically outperformed more complicated and recent MAL algorithms.
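As a point of reference, here is a minimal sketch of two of these metrics, assuming the standard definitions for an agent $i$ with action set $A_i$, reward function $r_i$, and $T$ rounds of play; the notation is ours, and the exact formulations used in the experiment may differ.

% Regret (assumed standard external form): how much more reward the best
% fixed action would have earned in hindsight than the agent actually got.
\[
  \mathrm{Regret}_i \;=\; \max_{a \in A_i} \sum_{t=1}^{T} r_i\bigl(a, a_{-i}^{t}\bigr)
  \;-\; \sum_{t=1}^{T} r_i\bigl(a_i^{t}, a_{-i}^{t}\bigr)
\]

% Maxmin distance (one plausible convention; sign conventions vary):
% gap between the agent's average reward and its security (maxmin) value.
\[
  \mathrm{MaxminDist}_i \;=\; \frac{1}{T}\sum_{t=1}^{T} r_i\bigl(a_i^{t}, a_{-i}^{t}\bigr)
  \;-\; \max_{\sigma_i} \min_{a_{-i}} r_i\bigl(\sigma_i, a_{-i}\bigr)
\]

Under these conventions, low regret and a nonnegative maxmin distance together indicate that an algorithm is performing well in hindsight and is not being pushed below its security value.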