A Novel Behavioral Strategy for RoboCode Platform Based on Deep Q-Learning

This paper presents a new machine learning-based behavioral strategy, built on the deep Q-learning algorithm, for the RoboCode simulation platform. Under this strategy, a new model is proposed for RoboCode, which provides an environment of simulated robots that can be programmed to battle against one another. Compared to Atari games, RoboCode has a considerably wider set of actions and situations. Because training a CNN model for such a continuous action space is challenging, the inputs obtained from the simulation environment were generated dynamically, and the proposed model was trained on these inputs. The trained model battled against the platform's predefined rival robots (standard robots), cumulatively benefiting from the experience gained against them. The comparison between the proposed model and the standard robots of the RoboCode platform was statistically verified. Finally, the performance of the proposed model was compared with machine learning-based customized robots (community robots). Experimental results reveal that the proposed model is superior to most community robots, demonstrating that the deep Q-learning-based model succeeds in such a complex simulation environment. Notably, the new model also maintains its performance in adaptive and partially cluttered environments.
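The core technique named in the abstract, a Q-learning update driven by dynamically generated state features, can be illustrated with a minimal sketch. This is not the paper's implementation: the state features (e.g. distance and bearing to a rival), the action set, and all hyperparameters below are illustrative assumptions, and a linear layer stands in for the paper's network.

```python
import numpy as np

# Minimal deep Q-learning sketch. All names and values are illustrative,
# not taken from the paper: the state vector, action set, and the linear
# "network" are stand-ins for the authors' dynamically generated inputs
# and trained model.

rng = np.random.default_rng(0)

N_FEATURES = 4   # e.g. [distance, bearing, own_energy, rival_energy] (assumed)
N_ACTIONS = 3    # e.g. {advance, retreat, fire} (assumed)
GAMMA = 0.95     # discount factor
LR = 0.01        # learning rate
EPSILON = 0.1    # exploration rate

# Single linear layer standing in for the Q-network.
W = rng.normal(scale=0.1, size=(N_ACTIONS, N_FEATURES))

def q_values(state):
    """Q(s, a) for every action a."""
    return W @ state

def select_action(state):
    """Epsilon-greedy policy over the Q-values."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(state)))

def td_update(state, action, reward, next_state, done):
    """One temporal-difference step toward the Q-learning target."""
    target = reward if done else reward + GAMMA * np.max(q_values(next_state))
    td_error = target - q_values(state)[action]
    W[action] += LR * td_error * state  # gradient step on a squared TD loss
    return td_error

# One illustrative transition: the agent acts and receives a reward.
s = np.array([0.5, -0.2, 0.9, 0.7])
a = select_action(s)
s_next = np.array([0.45, -0.1, 0.9, 0.5])
err = td_update(s, a, reward=1.0, next_state=s_next, done=False)
```

In the paper's setting, the transition `(s, a, reward, s_next)` would come from a RoboCode battle rather than hand-written arrays, and the linear layer would be replaced by the trained deep network.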


Bibliographic Details
Main Authors: Hakan Kayakoku, Mehmet Serdar Guzel, Erkan Bostanci, Ihsan Tolga Medeni, Deepti Mishra
Format: Article
Language: English
Published: Hindawi-Wiley, 2021-01-01
Series: Complexity
ISSN: 1099-0526
Online Access: http://dx.doi.org/10.1155/2021/9963018