Off-Policy Evaluation of the Performance of a Robot Swarm: Importance Sampling to Assess Potential Modifications to the Finite-State Machine That Controls the Robots

Due to the decentralized, loosely coupled nature of a swarm and to the lack of a general design methodology, the development of control software for robot swarms is typically an iterative process. Control software is generally modified and refined repeatedly, either manually or automatically, until...

Bibliographic Details
Main Authors:	Federico Pagnozzi, Mauro Birattari
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2021-04-01
Series:	Frontiers in Robotics and AI
Subjects:	swarm robotics; control software architecture; automatic design; reinforcement learning; importance sampling
Online Access:	https://www.frontiersin.org/articles/10.3389/frobt.2021.625125/full

id	doaj-6d8241e285bc49b08713e63da657d475
record_format	Article
spelling	doaj-6d8241e285bc49b08713e63da657d475 | 2021-04-29T11:06:41Z | eng | Frontiers Media S.A. | Frontiers in Robotics and AI | 2296-9144 | 2021-04-01 | 8 | 10.3389/frobt.2021.625125 | 625125
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Federico Pagnozzi, Mauro Birattari
author_sort	Federico Pagnozzi
title	Off-Policy Evaluation of the Performance of a Robot Swarm: Importance Sampling to Assess Potential Modifications to the Finite-State Machine That Controls the Robots
publisher	Frontiers Media S.A.
series	Frontiers in Robotics and AI
issn	2296-9144
publishDate	2021-04-01
description	Due to the decentralized, loosely coupled nature of a swarm and to the lack of a general design methodology, the development of control software for robot swarms is typically an iterative process. Control software is generally modified and refined repeatedly, either manually or automatically, until satisfactory results are obtained. In this paper, we propose a technique based on off-policy evaluation to estimate how the performance of an instance of control software, implemented as a probabilistic finite-state machine, would be impacted by modifying its structure and the values of its parameters. The proposed technique is particularly appealing when coupled with automatic design methods belonging to the AutoMoDe family, as it can exploit the data generated during the design process. The technique can be used either to reduce the complexity of the generated control software, thereby improving its readability, or to evaluate perturbations of the parameters, which could help in prioritizing the exploration of the neighborhood of the current solution within an iterative improvement algorithm. To evaluate the technique, we apply it to control software generated with an AutoMoDe method, Chocolate-6S. In a first experiment, we use the proposed technique to estimate the impact of removing a state from a probabilistic finite-state machine. In a second experiment, we use it to predict the impact of changing the values of the parameters. The results show that the technique is promising and significantly better than a naive estimation. We discuss the limitations of the current implementation of the technique, and we sketch possible improvements, extensions, and generalizations.
topic	swarm robotics; control software architecture; automatic design; reinforcement learning; importance sampling
url	https://www.frontiersin.org/articles/10.3389/frobt.2021.625125/full

Similar Items