Dynamic Scheduling Method for Job-Shop Manufacturing Systems by Deep Reinforcement Learning with Proximal Policy Optimization

Bibliographic Details
Main Authors: Amaitik, N. (Author), Hu, Y. (Author), Lu, Y. (Author), Xu, Y. (Author), Zhang, M. (Author)
Format: Article
Language: English
Published: MDPI 2022
Subjects:
Online Access: View Fulltext in Publisher
LEADER 02368nam a2200325Ia 4500
001 10.3390-su14095177
008 220517s2022 CNT 000 0 und d
020 |a 2071-1050 (ISSN) 
245 1 0 |a Dynamic Scheduling Method for Job-Shop Manufacturing Systems by Deep Reinforcement Learning with Proximal Policy Optimization 
260 0 |b MDPI  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/su14095177 
520 3 |a With the rapid development of Industry 4.0, modern manufacturing systems have been undergoing a profound digital transformation. New technologies help to improve production efficiency and product quality. However, as production systems grow increasingly complex, operational decision making faces greater challenges in keeping manufacturing sustainable while satisfying rapidly changing customer and market demands. Rule-based heuristic approaches are currently widely used for scheduling management in production systems; these, however, depend heavily on expert domain knowledge, so the efficiency of decision making can neither be guaranteed nor meet the dynamic scheduling requirements of the job-shop manufacturing environment. In this study, we propose using deep reinforcement learning (DRL) methods to tackle the dynamic scheduling problem in a job-shop manufacturing system with unexpected machine failures. The proximal policy optimization (PPO) algorithm was used within the DRL framework to accelerate the learning process and improve performance. The proposed method was evaluated in a real-world dynamic production environment, where it outperformed state-of-the-art methods. © 2022 by the authors. Licensee MDPI, Basel, Switzerland. 
650 0 4 |a artificial neural network 
650 0 4 |a Artificial neural networks 
650 0 4 |a decision making 
650 0 4 |a Deep reinforcement learning 
650 0 4 |a Dynamic scheduling 
650 0 4 |a Industry 4.0 
650 0 4 |a learning 
650 0 4 |a manufacturing 
650 0 4 |a Manufacturing sustainability 
650 0 4 |a optimization 
650 0 4 |a policy approach 
650 0 4 |a sustainability 
700 1 |a Amaitik, N.  |e author 
700 1 |a Hu, Y.  |e author 
700 1 |a Lu, Y.  |e author 
700 1 |a Xu, Y.  |e author 
700 1 |a Zhang, M.  |e author 
773 |t Sustainability (Switzerland)
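
The abstract names proximal policy optimization (PPO) as the learning algorithm but does not reproduce its objective. For reference, a minimal sketch of the standard PPO clipped surrogate loss (Schulman et al., 2017) is given below in PyTorch; it is an illustrative reconstruction, not the authors' implementation, and the function name, argument layout, and default clip range are assumptions.

import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    # Probability ratio r_t(theta) = pi_theta(a_t|s_t) / pi_theta_old(a_t|s_t),
    # computed from log-probabilities for numerical stability.
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    # Clipping the ratio to [1 - eps, 1 + eps] keeps each policy update close
    # to the data-collecting policy, which stabilizes and accelerates learning.
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the pessimistic (element-wise minimum) surrogate;
    # the sign is flipped so the loss can be minimized by gradient descent.
    return -torch.min(unclipped, clipped).mean()

In a job-shop setting like the one described, the state would typically encode machine and job status (including failures) and each action would select a job or dispatching rule; those design choices belong to the paper itself and are not reproduced here.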