Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments

Reservoir Computing (RC) is an emerging machine learning paradigm where a fixed kernel, built from a randomly connected "reservoir" with sufficiently rich dynamics, is capable of expanding the problem space in a non-linear fashion to a higher-dimensional feature space. These features can t...


Bibliographic Details
Main Author: Gargesa, Padmashri
Format: Others
Published: PDXScholar 2013
Subjects:
Online Access: https://pdxscholar.library.pdx.edu/open_access_etds/669
https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1668&context=open_access_etds
id ndltd-pdx.edu-oai-pdxscholar.library.pdx.edu-open_access_etds-1668
record_format oai_dc
spelling ndltd-pdx.edu-oai-pdxscholar.library.pdx.edu-open_access_etds-1668 2019-10-20T04:35:36Z Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments Gargesa, Padmashri Reservoir Computing (RC) is an emerging machine learning paradigm where a fixed kernel, built from a randomly connected "reservoir" with sufficiently rich dynamics, is capable of expanding the problem space in a non-linear fashion to a higher-dimensional feature space. These features can then be interpreted by a linear readout layer that is trained by a gradient descent method. In comparison to traditional neural networks, only the output layer needs to be trained, which leads to a significant computational advantage. In addition, the short-term memory of the reservoir dynamics can transform a complex temporal input state space into a simple non-temporal representation. Adaptive real-time systems are multi-stage decision problems that can be used to train an agent to achieve a preset goal by performing an optimal action at each timestep. In such problems, the agent learns through continuous interactions with its environment. Conventional techniques for solving such problems become computationally expensive or may not converge if the state space being considered is large or partially observable, or if short-term memory is required for optimal decision making. The objective of this thesis is to use reservoir computers to solve such goal-driven tasks, where no error signal can be readily calculated to apply gradient descent methodologies. To address this challenge, we propose a novel reinforcement learning approach in combination with reservoir computers built from simple Boolean components. Such reservoirs are of interest because they have the potential to be fabricated by self-assembly techniques. We evaluate the performance of our approach in both Markovian and non-Markovian environments. We compare its performance with that of an agent trained through traditional Q-Learning. We find that the reservoir-based agent performs successfully in these problem contexts and even performs marginally better than Q-Learning agents in certain cases. Our proposed approach allows us to retain the advantage of traditional parameterized dynamic systems in successfully modeling embedded state-space representations while eliminating the complexity involved in training traditional neural networks. To the best of our knowledge, our method of training a reservoir readout layer through an on-policy bootstrapping approach is unique in the field of random Boolean network reservoirs. 2013-03-27T07:00:00Z text application/pdf https://pdxscholar.library.pdx.edu/open_access_etds/669 https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1668&context=open_access_etds Dissertations and Theses PDXScholar Computer networks -- Technological innovations Context-aware computing Machine learning Dynamic programming Evolutionary computation Dynamics and Dynamical Systems Electrical and Computer Engineering
collection NDLTD
format Others
sources NDLTD
topic Computer networks -- Technological innovations
Context-aware computing
Machine learning
Dynamic programming
Evolutionary computation
Dynamics and Dynamical Systems
Electrical and Computer Engineering
spellingShingle Computer networks -- Technological innovations
Context-aware computing
Machine learning
Dynamic programming
Evolutionary computation
Dynamics and Dynamical Systems
Electrical and Computer Engineering
Gargesa, Padmashri
Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
description Reservoir Computing (RC) is an emerging machine learning paradigm where a fixed kernel, built from a randomly connected "reservoir" with sufficiently rich dynamics, is capable of expanding the problem space in a non-linear fashion to a higher-dimensional feature space. These features can then be interpreted by a linear readout layer that is trained by a gradient descent method. In comparison to traditional neural networks, only the output layer needs to be trained, which leads to a significant computational advantage. In addition, the short-term memory of the reservoir dynamics can transform a complex temporal input state space into a simple non-temporal representation. Adaptive real-time systems are multi-stage decision problems that can be used to train an agent to achieve a preset goal by performing an optimal action at each timestep. In such problems, the agent learns through continuous interactions with its environment. Conventional techniques for solving such problems become computationally expensive or may not converge if the state space being considered is large or partially observable, or if short-term memory is required for optimal decision making. The objective of this thesis is to use reservoir computers to solve such goal-driven tasks, where no error signal can be readily calculated to apply gradient descent methodologies. To address this challenge, we propose a novel reinforcement learning approach in combination with reservoir computers built from simple Boolean components. Such reservoirs are of interest because they have the potential to be fabricated by self-assembly techniques. We evaluate the performance of our approach in both Markovian and non-Markovian environments. We compare its performance with that of an agent trained through traditional Q-Learning. We find that the reservoir-based agent performs successfully in these problem contexts and even performs marginally better than Q-Learning agents in certain cases. Our proposed approach allows us to retain the advantage of traditional parameterized dynamic systems in successfully modeling embedded state-space representations while eliminating the complexity involved in training traditional neural networks. To the best of our knowledge, our method of training a reservoir readout layer through an on-policy bootstrapping approach is unique in the field of random Boolean network reservoirs.
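The general idea in the description (a fixed random Boolean network whose node states feed a trainable linear readout, updated from reward through an on-policy bootstrapped target) can be illustrated with a minimal sketch. This is not the thesis's implementation. The class names `RBNReservoir` and `LinearReadout`, the input-injection scheme (overwriting the first nodes), the synchronous update, the SARSA-style rule, and all parameter values are assumptions chosen only for illustration.

```python
import random


class RBNReservoir:
    """Random Boolean network: n nodes, each reading k random nodes
    through its own random truth table (hypothetical illustration)."""

    def __init__(self, n=20, k=2, seed=0):
        rng = random.Random(seed)
        self.n, self.k = n, k
        # wiring[i] lists the k nodes that feed node i
        self.wiring = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
        # tables[i] maps each of the 2**k input patterns to an output bit
        self.tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
        self.state = [0] * n

    def step(self, input_bits):
        """Overwrite the first nodes with the input bits (an assumed
        injection scheme), then update every node synchronously."""
        s = list(self.state)
        s[:len(input_bits)] = input_bits
        new_state = []
        for i in range(self.n):
            idx = 0
            for j in self.wiring[i]:
                idx = (idx << 1) | s[j]
            new_state.append(self.tables[i][idx])
        self.state = new_state
        return new_state + [1]  # reservoir features plus a constant bias term


class LinearReadout:
    """One weight vector per action; only these weights are trained."""

    def __init__(self, n_features, n_actions):
        self.w = [[0.0] * n_features for _ in range(n_actions)]

    def q(self, x, a):
        return sum(wi * xi for wi, xi in zip(self.w[a], x))

    def sarsa_update(self, x, a, r, x_next, a_next, done, alpha=0.05, gamma=0.9):
        """On-policy bootstrapped update: the target uses the value of the
        action actually chosen in the next step."""
        target = r if done else r + gamma * self.q(x_next, a_next)
        td_error = target - self.q(x, a)
        for i, xi in enumerate(x):
            self.w[a][i] += alpha * td_error * xi
        return td_error


def epsilon_greedy(readout, x, n_actions, eps, rng):
    """Pick a random action with probability eps, else the greedy one."""
    if rng.random() < eps:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: readout.q(x, a))
```

In this sketch the reservoir wiring and truth tables are never modified after construction; learning touches only the readout weights, which mirrors the computational advantage the description attributes to reservoir computing.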
author Gargesa, Padmashri
author_facet Gargesa, Padmashri
author_sort Gargesa, Padmashri
title Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
title_short Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
title_full Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
title_fullStr Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
title_full_unstemmed Reward-driven Training of Random Boolean Network Reservoirs for Model-Free Environments
title_sort reward-driven training of random boolean network reservoirs for model-free environments
publisher PDXScholar
publishDate 2013
url https://pdxscholar.library.pdx.edu/open_access_etds/669
https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1668&context=open_access_etds
work_keys_str_mv AT gargesapadmashri rewarddriventrainingofrandombooleannetworkreservoirsformodelfreeenvironments
_version_ 1719271163053473792