Coverage Path Planning Using Reinforcement Learning-Based TSP for hTetran—A Polyabolo-Inspired Self-Reconfigurable Tiling Robot

One of the critical challenges in deploying cleaning robots is achieving complete coverage of the entire area. Current tiling robots for area coverage have fixed forms and are limited to cleaning only certain areas. A reconfigurable system is a creative answer to this coverage problem...

Bibliographic Details
Main Authors:	Anh Vu Le, Prabakaran Veerajagadheswar, Phone Thiha Kyaw, Mohan Rajesh Elara, Nguyen Huu Khanh Nhan
Format:	Article
Language:	English
Published:	MDPI AG 2021-04-01
Series:	Sensors
Subjects:	reconfigurable system; tiling robotic; reinforcement learning TSP, complete path planning; energy-aware reward function
Online Access:	https://www.mdpi.com/1424-8220/21/8/2577

id	doaj-79a71abff4a740a1a2f2b63ae1d6c3a2
record_format	Article
spelling	doaj-79a71abff4a740a1a2f2b63ae1d6c3a2; 2021-04-07T23:01:49Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2021-04-01; vol. 21, article 2577; DOI 10.3390/s21082577; https://www.mdpi.com/1424-8220/21/8/2577
	Title: Coverage Path Planning Using Reinforcement Learning-Based TSP for hTetran—A Polyabolo-Inspired Self-Reconfigurable Tiling Robot
	Anh Vu Le: ROAR Lab, Engineering Product Development, Singapore University of Technology and Design, Singapore 487372, Singapore
	Prabakaran Veerajagadheswar: ROAR Lab, Engineering Product Development, Singapore University of Technology and Design, Singapore 487372, Singapore
	Phone Thiha Kyaw: Department of Mechatronic Engineering, Yangon Technological University, Insein 11101, Myanmar
	Mohan Rajesh Elara: ROAR Lab, Engineering Product Development, Singapore University of Technology and Design, Singapore 487372, Singapore
	Nguyen Huu Khanh Nhan: Optoelectronics Research Group, Faculty of Electrical and Electronics Engineering, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam
	Keywords: reconfigurable system; tiling robotic; reinforcement learning TSP, complete path planning; energy-aware reward function
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Anh Vu Le; Prabakaran Veerajagadheswar; Phone Thiha Kyaw; Mohan Rajesh Elara; Nguyen Huu Khanh Nhan
title	Coverage Path Planning Using Reinforcement Learning-Based TSP for hTetran—A Polyabolo-Inspired Self-Reconfigurable Tiling Robot
publisher	MDPI AG
series	Sensors
issn	1424-8220
publishDate	2021-04-01
description	One of the critical challenges in deploying cleaning robots is achieving complete coverage of the entire area. Current tiling robots for area coverage have fixed forms and are limited to cleaning only certain areas. A reconfigurable system is a creative answer to this coverage problem. The tiling robot aims to cover the entire area by reconfiguring into different shapes according to the area’s needs. When sequencing navigation, it is essential to have a structure that lets the robot extend its coverage range while saving energy during navigation, so that the robot can cover larger areas completely with the fewest required actions. This paper presents a complete path planning (CPP) method for hTetran, a polyabolo-inspired tiling robot, built on TSP-based reinforcement learning optimization. The method simultaneously produces robot shapes and sequential trajectories while maximizing the reward of the trained reinforcement learning (RL) model within the predefined polyabolo-based tileset. To this end, a reinforcement learning-based travelling salesman problem (TSP) model was trained with the proximal policy optimization (PPO) algorithm, using the complementary learning computation of the TSP sequencing. The results of the proposed RL-TSP-based CPP for hTetran were compared, in terms of energy and time spent, with conventional tiled hypothetical models that solve the TSP through an evolutionary ant colony optimization (ACO) approach. The CPP demonstrates an ability to generate a Pareto-optimal trajectory that enhances the robot’s navigation inside a real environment, with the least energy and time spent compared with conventional techniques.
topic	reconfigurable system; tiling robotic; reinforcement learning TSP, complete path planning; energy-aware reward function
url	https://www.mdpi.com/1424-8220/21/8/2577

Similar Items