Temporal difference learning in complex domains
This thesis adapts and improves on the methods of TD(λ) (Sutton 1988) that were successfully used for backgammon (Tesauro 1994) and applies them to other complex games that are less amenable to simple pattern-matching approaches. The games investigated are chess and shogi, both of which (unlike backg...
Main Author: | |
---|---|
Published: | Queen Mary, University of London, 1999 |
Subjects: | |
Online Access: | https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.313294 |