Exploring the Effect of Different Numbers of Convolutional Filters and Training Loops on the Performance of AlphaZero

In this work, the algorithm used by AlphaZero is adapted for dots and boxes, a two-player game. This algorithm is explored using different numbers of convolutional filters and training loops, in order to better understand the effect these parameters have on the learning of the player. Different boar...

Bibliographic Details
Main Author: Prince, Jared
Format: Others
Published: TopSCHOLAR® 2018
Online Access: https://digitalcommons.wku.edu/theses/3087
https://digitalcommons.wku.edu/cgi/viewcontent.cgi?article=4090&context=theses
id ndltd-WKU-oai-digitalcommons.wku.edu-theses-4090
record_format oai_dc
spelling ndltd-WKU-oai-digitalcommons.wku.edu-theses-4090 2019-10-15T04:50:38Z Exploring the Effect of Different Numbers of Convolutional Filters and Training Loops on the Performance of AlphaZero Prince, Jared In this work, the algorithm used by AlphaZero is adapted for dots and boxes, a two-player game. This algorithm is explored using different numbers of convolutional filters and training loops, in order to better understand the effect these parameters have on the learning of the player. Different board sizes are also tested to compare these parameters in relation to game complexity. AlphaZero originated as a Go player using an algorithm which combines Monte Carlo tree search and convolutional neural networks. This novel approach, integrating a reinforcement learning method previously applied to Go (MCTS) with a supervised learning method (neural networks), led to a player which beat all its competitors. 2018-10-01T07:00:00Z text application/pdf https://digitalcommons.wku.edu/theses/3087 https://digitalcommons.wku.edu/cgi/viewcontent.cgi?article=4090&context=theses Masters Theses & Specialist Projects TopSCHOLAR® Monte Carlo tree search neural network dots and boxes Other Computer Sciences Robotics Theory and Algorithms
collection NDLTD
format Others
sources NDLTD
topic Monte Carlo tree search
neural network
dots and boxes
Other Computer Sciences
Robotics
Theory and Algorithms
description In this work, the algorithm used by AlphaZero is adapted for dots and boxes, a two-player game. This algorithm is explored using different numbers of convolutional filters and training loops, in order to better understand the effect these parameters have on the learning of the player. Different board sizes are also tested to compare these parameters in relation to game complexity. AlphaZero originated as a Go player using an algorithm which combines Monte Carlo tree search and convolutional neural networks. This novel approach, integrating a reinforcement learning method previously applied to Go (MCTS) with a supervised learning method (neural networks), led to a player which beat all its competitors.
author Prince, Jared
title Exploring the Effect of Different Numbers of Convolutional Filters and Training Loops on the Performance of AlphaZero
publisher TopSCHOLAR®
publishDate 2018
url https://digitalcommons.wku.edu/theses/3087
https://digitalcommons.wku.edu/cgi/viewcontent.cgi?article=4090&context=theses