Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
Reservoir Computing (RC) is an emerging machine learning paradigm where a fixed kernel, built from a randomly connected "reservoir" with sufficiently rich dynamics, is capable of expanding the problem space in a non-linear fashion to a higher dimensional feature space. These features can t...
Main Author: | Gargesa, Padmashri |
---|---|
Format: | Others |
Published: | PDXScholar 2013 |
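Subjects: | Computer networks -- Technological innovations; Context-aware computing; Machine learning; Dynamic programming; Evolutionary computation; Dynamics and Dynamical Systems; Electrical and Computer Engineering |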
Online Access: | https://pdxscholar.library.pdx.edu/open_access_etds/669 https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1668&context=open_access_etds |
id |
ndltd-pdx.edu-oai-pdxscholar.library.pdx.edu-open_access_etds-1668 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-pdx.edu-oai-pdxscholar.library.pdx.edu-open_access_etds-1668 2019-10-20T04:35:36Z Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments Gargesa, Padmashri 2013-03-27T07:00:00Z text application/pdf https://pdxscholar.library.pdx.edu/open_access_etds/669 https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1668&context=open_access_etds Dissertations and Theses PDXScholar Computer networks -- Technological innovations Context-aware computing Machine learning Dynamic programming Evolutionary computation Dynamics and Dynamical Systems Electrical and Computer Engineering |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Computer networks -- Technological innovations; Context-aware computing; Machine learning; Dynamic programming; Evolutionary computation; Dynamics and Dynamical Systems; Electrical and Computer Engineering |
description |
Reservoir Computing (RC) is an emerging machine learning paradigm in which a fixed kernel, built from a randomly connected "reservoir" with sufficiently rich dynamics, expands the problem space non-linearly into a higher-dimensional feature space. These features can then be interpreted by a linear readout layer that is trained by a gradient descent method. In comparison to traditional neural networks, only the output layer needs to be trained, which leads to a significant computational advantage. In addition, the short-term memory of the reservoir dynamics can transform a complex temporal input state space into a simple non-temporal representation. Adaptive real-time systems are multi-stage decision problems in which an agent is trained to achieve a preset goal by performing an optimal action at each timestep. In such problems, the agent learns through continuous interactions with its environment. Conventional techniques for solving such problems become computationally expensive or may not converge if the state space being considered is large or partially observable, or if short-term memory is required for optimal decision making. The objective of this thesis is to use reservoir computers to solve such goal-driven tasks, where no error signal can be readily calculated to apply gradient descent methodologies. To address this challenge, we propose a novel reinforcement learning approach in combination with reservoir computers built from simple Boolean components. Such reservoirs are of interest because they have the potential to be fabricated by self-assembly techniques. We evaluate the performance of our approach in both Markovian and non-Markovian environments and compare it with that of an agent trained through traditional Q-Learning. We find that the reservoir-based agent performs successfully in these problem contexts and even performs marginally better than Q-Learning agents in certain cases.
Our proposed approach retains the advantage of traditional parameterized dynamic systems in successfully modeling embedded state-space representations while eliminating the complexity involved in training traditional neural networks. To the best of our knowledge, our method of training a reservoir readout layer through an on-policy bootstrapping approach is unique in the field of random Boolean network reservoirs. |
author |
Gargesa, Padmashri |
title |
Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments |
publisher |
PDXScholar |
publishDate |
2013 |
url |
https://pdxscholar.library.pdx.edu/open_access_etds/669 https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1668&context=open_access_etds |
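The architecture named in the description (a fixed random Boolean network reservoir feeding a linear readout that is trained from rewards by on-policy bootstrapping) can be illustrated with a minimal sketch. This is not the thesis's implementation: the class `RBNReservoir`, the helpers `features` and `sarsa_update`, the input scheme (overwriting the first nodes with the input bits), and the SARSA-style update rule with its `alpha` and `gamma` values are all assumptions made for illustration.

```python
import numpy as np


class RBNReservoir:
    """Hypothetical random Boolean network: each of N nodes reads K random
    nodes through a random truth table. The wiring and tables stay fixed."""

    def __init__(self, n_nodes, k, n_inputs, seed=0):
        rng = np.random.default_rng(seed)
        self.n_nodes, self.k, self.n_inputs = n_nodes, k, n_inputs
        # wiring[i] lists the K nodes that feed node i.
        self.wiring = rng.integers(0, n_nodes, size=(n_nodes, k))
        # tables[i] is node i's Boolean function as a 2**K-entry lookup table.
        self.tables = rng.integers(0, 2, size=(n_nodes, 2 ** k))
        self.state = np.zeros(n_nodes, dtype=np.int64)

    def step(self, input_bits):
        # Assumed input scheme: overwrite the first n_inputs nodes.
        self.state[: self.n_inputs] = input_bits
        # Gather each node's K inputs and pack them into a table index.
        neighbours = self.state[self.wiring]
        index = neighbours @ (1 << np.arange(self.k))
        # Synchronous update of every node from its own truth table.
        self.state = self.tables[np.arange(self.n_nodes), index]
        return self.state


def features(state):
    """Reservoir state plus a constant bias term, as floats."""
    return np.append(state, 1).astype(float)


def sarsa_update(W, phi, action, reward, phi_next, action_next,
                 alpha=0.05, gamma=0.9, terminal=False):
    """One on-policy TD (SARSA-style) step on the linear readout only.

    Q(s, a) is approximated as W[action] @ phi. The target bootstraps from
    the value of the action actually chosen next. W is updated in place.
    """
    bootstrap = 0.0 if terminal else gamma * (W[action_next] @ phi_next)
    td_error = reward + bootstrap - W[action] @ phi
    W[action] += alpha * td_error * phi
    return td_error
```

In a training loop under these assumptions, the agent would call `step` with the encoded observation, pick an action (for example epsilon-greedily) from `W @ features(state)`, and call `sarsa_update` once the next action is known. Only `W` changes during training; the reservoir itself is never modified.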