Forward and Inverse Decision-Making in Adversarial, Cooperative, and Biologically-Inspired Dynamical Systems

Decision-making is the mechanism of using available information to develop solutions to given problems by forming preferences or beliefs, or by selecting courses of action among several alternatives. It is a central topic in a variety of scientific fields, such as robotics, finance, and neuroscience. In...


Bibliographic Details
Main Author: de Miranda de Matos Lourenço, Inês
Format: Others
Language: English
Published: KTH, Reglerteknik 2021
Subjects: Control Engineering; Reglerteknik
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295301
http://nbn-resolving.de/urn:isbn:978-91-7873-884-7
id ndltd-UPSALLA1-oai-DiVA.org-kth-295301
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-kth-295301 2021-05-22T05:23:52Z Forward and Inverse Decision-Making in Adversarial, Cooperative, and Biologically-Inspired Dynamical Systems eng de Miranda de Matos Lourenço, Inês KTH, Reglerteknik Stockholm 2021 Control Engineering Reglerteknik QC 20210521 Licentiate thesis, monograph info:eu-repo/semantics/masterThesis text http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295301 urn:isbn:978-91-7873-884-7 TRITA-EECS-AVL ; 2021:34 application/pdf info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Control Engineering
Reglerteknik
description Decision-making is the mechanism of using available information to develop solutions to given problems by forming preferences or beliefs, or by selecting courses of action among several alternatives. It is a central topic in a variety of scientific fields, such as robotics, finance, and neuroscience. In this thesis, we study the mechanisms that generate behavior in diverse decision-making settings (the forward problem) and how their characteristics can explain observed behavior (the inverse problem). Both problems play a central role in current research because of the desire to understand the features of system behavior, often under risk and uncertainty. We study decision-making problems in the following three settings.
In the first setting, we consider a decision-maker who forms a private belief (posterior distribution) on the state of the world by filtering private information. Estimating private beliefs is a way to understand what drives decisions, and it forms a foundation for predicting and counteracting future actions. In this adversarial setting, we address the questions of (i) how an adversary can estimate the private belief of the decision-maker by observing its decisions (under two different scenarios), and (ii) how the decision-maker can protect its private belief by confusing the adversary. We illustrate the applicability of our frameworks in regime-switching Markovian portfolio allocation.
In the second setting, we shift from an adversarial to a cooperative scenario. We consider a teacher-student framework similar to that used in learning-from-demonstration and transfer-learning setups. An expert agent (teacher) knows the model of a system and wants to assist a learner agent (student) in performing identification for that system, but cannot directly transfer its knowledge to the student. For example, the teacher's knowledge of the system might be abstract, or the teacher and student might employ different model classes, which renders the teacher's parameters uninformative to the student. We propose correctional learning, an approach in which the teacher assists the student by intercepting the observations collected from the system and modifying them to maximize the amount of information the student receives about the system. We obtain finite-sample results for correctional learning of binomial systems.
In the third and final setting, we shift our attention to cognitive science and the decision-making of biological systems, to gain insight into the intrinsic characteristics of these systems. We focus on time perception, that is, how humans and animals perceive the passage of time, and solve the forward problem by designing a biologically-inspired decision-making framework that replicates the mechanisms responsible for time perception. We conclude that a simulated robot equipped with our framework is able to perceive time similarly to animals with respect to their intrinsic mechanisms of interpreting time and performing time-aware actions. We then turn to the inverse problem: based on the empirical action probability distribution of the agent, we are able to estimate the parameters it uses for perceiving time. Our work shows promising results for drawing conclusions about some of the characteristics present in biological timing mechanisms.
QC 20210521
author de Miranda de Matos Lourenço, Inês
title Forward and Inverse Decision-Making in Adversarial, Cooperative, and Biologically-Inspired Dynamical Systems
publisher KTH, Reglerteknik
publishDate 2021
url http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295301
http://nbn-resolving.de/urn:isbn:978-91-7873-884-7