Learning from Observation Using Primitives
Main Author:
Format: Others
Language: en_US
Published: Georgia Institute of Technology, 2005
Subjects:
Online Access: http://hdl.handle.net/1853/5100
Summary: Learning without any prior knowledge in environments that contain large or continuous state spaces is a daunting task. For robots that operate in the real world, learning must occur in a reasonable amount of time. Providing a robot with domain knowledge, and with the ability to learn from watching others, can greatly increase its learning rate. This research explores learning algorithms that learn quickly and make the most of information obtained from observing others. Domain knowledge is encoded in the form of primitives: small parts of a task that are executed many times while the task is being performed. This thesis explores and presents many of the challenges involved in programming robots to learn and adapt in environments that humans operate in.
A "Learning from Observation Using Primitives" framework has been created that provides the means to observe primitives as they are performed by others. The robot uses this information in a three-level process as it acts in the environment. At the first level, the robot chooses a primitive for the observed state. The second level decides the manner in which the chosen primitive will be performed. The third level uses this information to control the robot as necessary to perform the desired action. The framework also lets the robot observe and evaluate its own actions as it performs in the environment, allowing it to improve its selection and execution of primitives.
The framework and algorithms have been evaluated on two testbeds: Air Hockey and Marble Maze. The tasks are performed both by physical robots and in simulation. Our robots can observe humans as they operate in these environments. The software version of Air Hockey lets a human play against a cyber player, and the hardware version lets the human play against a 30 degree-of-freedom humanoid robot. Implementing our learning system on these tasks clearly exposes many of the issues involved in having robots learn and perform in dynamic environments.