Mutual Reinforcement Learning
Main Author:
Other Authors:
Language: en_US
Published: 2021
Subjects:
Online Access: http://hdl.handle.net/1805/25957 http://dx.doi.org/10.7912/C2/11
Summary: Indiana University-Purdue University Indianapolis (IUPUI)

Mutual learning is an emerging field in intelligent systems which takes inspiration from
naturally intelligent agents and attempts to explore how agents can communicate and
cooperate to share information and learn more quickly. While agents in many biological systems
have little trouble learning from one another, it is not immediately obvious how artificial
agents would achieve similar learning. In this thesis, I explore how agents learn to interact
with complex systems. I further explore how these complex learning agents may be able
to transfer knowledge to one another to improve their learning performance when they are
learning together and have the power of communication. While significant research has been
done to explore the problem of knowledge transfer, the existing literature is concerned
either with supervised learning tasks or relatively simple discrete reinforcement learning. The
work presented here is, to my knowledge, the first which admits continuous state spaces and
deep reinforcement learning techniques. The first contribution of this thesis, presented in
Chapter 2, is a modified version of deep Q-learning which demonstrates improved learning
performance due to the addition of a mutual learning term which penalizes disagreement
between mutually learning agents. The second contribution, in Chapter 3, presents
work describing effective communication between agents which use fundamentally different
knowledge representations and systems of learning (model-free deep Q-learning and
model-based adaptive dynamic programming), and I discuss how the agents can mathematically
negotiate their trust in one another to achieve superior learning performance. I conclude
with a discussion of the promise shown by this area of research and a discussion of problems
which I believe are exciting directions for future research.
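The abstract's first contribution, a deep Q-learning update augmented with a term penalizing disagreement between mutually learning agents, is not spelled out in this record. A minimal sketch of what such a combined loss could look like, assuming a mean-squared TD error and a hypothetical trust weight `beta` (both names are illustrative, not taken from the thesis):

```python
def mutual_q_loss(q_a, q_b, target, beta=0.1):
    """Per-sample loss for agent A: TD error plus a mutual learning term.

    q_a, q_b : the two agents' Q-value estimates for the same state-action pair
    target   : bootstrapped TD target, e.g. r + gamma * max_a' Q(s', a')
    beta     : hypothetical trust weight on the disagreement penalty
    """
    td_loss = (q_a - target) ** 2          # standard squared TD error
    mutual_penalty = beta * (q_a - q_b) ** 2  # penalize disagreement with agent B
    return td_loss + mutual_penalty
```

With `beta = 0`, this reduces to ordinary deep Q-learning; a larger `beta` pulls the agents' value estimates toward one another, which is one plausible reading of how the agents "negotiate their trust" described in Chapter 3.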