Summary: Trial and error learning methods are often ineffective when applied to robots. This is due to characteristics common to robotic domains, such as large continuous state spaces, noisy sensors, and faulty actuators. Learning algorithms work best with small discrete state spaces, discrete deterministic actions, and accurate identification of state. Since trial and error learning requires that an agent learn by trying actions in all possible situations, the large continuous state space is the most problematic of these characteristics, making the learning algorithm inefficient. There is rarely enough time to explicitly visit every state or enough memory to store the best action for every state.
This thesis explores methods for achieving reinforcement learning on large continuous state spaces where actions are not discrete. This is done by creating abstract states, allowing one abstract state to represent numerous similar concrete states. This saves time, since not every state within an abstract state needs to be visited, and saves space, since only one entry needs to be stored per abstract state.
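As a rough illustration of this saving (not the thesis's adaptive partitioning, which is described next), a fixed-grid abstraction in Python shows how many nearby continuous states can share a single stored entry; the `discretize` rule, cell size, and state format here are hypothetical:

```python
from typing import Dict, Tuple

Abstract = Tuple[int, ...]

def discretize(state: Tuple[float, ...], cell_size: float = 0.5) -> Abstract:
    """Map a continuous state to a coarse abstract state by gridding each dimension.

    Hypothetical fixed grid, used only to illustrate abstraction; the thesis
    learns the partition adaptively with a KD-tree instead."""
    return tuple(int(x // cell_size) for x in state)

# One stored "best action" per abstract state, however many concrete states it covers.
best_action: Dict[Abstract, int] = {}

best_action[discretize((0.12, 0.34))] = 3        # visit one concrete state...
print(discretize((0.20, 0.40)) in best_action)   # True: a nearby state reuses the same entry
```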
The algorithm tested in this thesis learns which volumes of the state space are similar by recursively subdividing each volume with a KD-tree. Whether an abstract state should be split, which dimension should be split, and where along that dimension the split should occur are determined by collecting statistics on the previous effects of actions. Continuous actions are handled by giving actions inertia, so that they can persist past state boundaries when necessary.
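A minimal sketch, with placeholder names and a deliberately simplified split rule, of the kind of KD-tree partitioning described above; the actual criteria for whether, where, and along which dimension to split come from the statistics developed in the thesis:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class KDNode:
    """One abstract state: an axis-aligned volume of the continuous state space."""
    low: List[float]
    high: List[float]
    samples: List[List[float]] = field(default_factory=list)  # recorded effects of actions
    split_dim: Optional[int] = None
    split_val: Optional[float] = None
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None

    def leaf_for(self, state: List[float]) -> "KDNode":
        """Descend to the leaf (abstract state) whose volume contains this state."""
        if self.split_dim is None:
            return self
        child = self.left if state[self.split_dim] < self.split_val else self.right
        return child.leaf_for(state)

    def maybe_split(self, min_samples: int = 20) -> None:
        """Hypothetical rule: split the widest dimension at its midpoint once enough
        samples are collected. The thesis instead decides whether, which dimension,
        and where to split from statistics on the previous effects of actions."""
        if self.split_dim is not None or len(self.samples) < min_samples:
            return
        widths = [h - l for l, h in zip(self.low, self.high)]
        d = widths.index(max(widths))
        mid = (self.low[d] + self.high[d]) / 2.0
        self.split_dim, self.split_val = d, mid
        left_high, right_low = self.high[:], self.low[:]
        left_high[d], right_low[d] = mid, mid
        self.left = KDNode(low=self.low[:], high=left_high)
        self.right = KDNode(low=right_low, high=self.high[:])
```

Action inertia would then let the action currently being executed continue across a leaf boundary rather than being re-selected in every newly entered abstract state.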