Deep Reinforcement Learning for Continuous Power Allocation in Flexible High Throughput Satellites

© 2019 IEEE. Many of the next generation of satellites will be equipped with numerous degrees of freedom in power and bandwidth allocation capabilities, making manual resource allocation impractical. Therefore, it is desirable to automate the operation of these highly flexible satellites. This paper...


Bibliographic Details
Main Authors: Luis, Juan Jose Garau (Author), Guerster, Markus (Author), del Portillo, Inigo (Author), Crawley, Edward (Author), Cameron, Bruce Gregory (Author)
Other Authors: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor)
Format: Article
Language: English
Published: IEEE, 2021-11-22T18:47:23Z.
Online Access: https://hdl.handle.net/1721.1/137285.2
LEADER 01707 am a22002173u 4500
001 137285.2
042 |a dc 
100 1 0 |a Luis, Juan Jose Garau  |e author 
100 1 0 |a Massachusetts Institute of Technology. Department of Aeronautics and Astronautics  |e contributor 
700 1 0 |a Guerster, Markus  |e author 
700 1 0 |a del Portillo, Inigo  |e author 
700 1 0 |a Crawley, Edward  |e author 
700 1 0 |a Cameron, Bruce Gregory  |e author 
245 0 0 |a Deep Reinforcement Learning for Continuous Power Allocation in Flexible High Throughput Satellites 
260 |b IEEE,   |c 2021-11-22T18:47:23Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/137285.2 
520 |a © 2019 IEEE. Many of the next generation of satellites will be equipped with numerous degrees of freedom in power and bandwidth allocation capabilities, making manual resource allocation impractical. Therefore, it is desirable to automate the operation of these highly flexible satellites. This paper presents a novel approach based on Deep Reinforcement Learning to allocate power in multibeam satellite systems. The proposed architecture represents the problem as continuous state and action spaces. We make use of the Proximal Policy Optimization algorithm to optimize the allocation policy for minimum unmet system demand and power consumption. Finally, the performance of the algorithm is analyzed through simulations of a multibeam satellite system. The analysis shows promising results for Deep Reinforcement Learning to be used as a dynamic resource allocation algorithm. 
546 |a en 
655 7 |a Article 
773 |t 10.1109/ccaaw.2019.8904901 
773 |t 2019 IEEE Cognitive Communications for Aerospace Applications Workshop, CCAAW 2019