Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner (Learning in Emotional Behavior Networks: Online Unsupervised Reinforcement Learning in Continuous Domains)
The largest project at the AICG lab at Linköping University, Cognitive models for virtual characters, focuses on creating an agent architecture for intelligent, virtual characters. The goal is to create an agent that acts naturally and gives a realistic user experience. The purpose of this thesis is...
Main Authors: | Wahlström, Jonathan; Djupfeldt, Oscar |
---|---|
Format: | Others |
Language: | Swedish |
Published: | Linköpings universitet, Institutionen för teknik och naturvetenskap, 2010 |
Subjects: | learning; emotional behavior networks; online unsupervised reinforcement learning; AI; Computer Sciences; Other Computer and Information Science |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-54442 |
id |
ndltd-UPSALLA1-oai-DiVA.org-liu-54442 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-liu-54442 2018-01-13T05:16:30Z Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner (swe) / Learning in Emotional Behavior Networks : Online Unsupervised Reinforcement Learning in Continuous Domains. Wahlström, Jonathan; Djupfeldt, Oscar. Linköpings universitet, Institutionen för teknik och naturvetenskap, 2010. Keywords: learning; emotional; behavior networks; online; reinforcement; unsupervised; AI; Computer Sciences; Other Computer and Information Science. Project: Cognitive models for virtual characters. Student thesis (masterThesis), text, application/pdf, openAccess. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-54442 |
collection |
NDLTD |
language |
Swedish |
format |
Others
|
sources |
NDLTD |
topic |
learning emotional behavior networks online reinforcement unsupervised AI inlärning beteendenätverk förstärkt lärande AI Computer Sciences Datavetenskap (datalogi) Other Computer and Information Science Annan data- och informationsvetenskap |
spellingShingle |
learning emotional behavior networks online reinforcement unsupervised AI inlärning beteendenätverk förstärkt lärande AI Computer Sciences Datavetenskap (datalogi) Other Computer and Information Science Annan data- och informationsvetenskap Wahlström, Jonathan Djupfeldt, Oscar Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner |
description |
The largest project at the AICG lab at Linköping University, Cognitive models for virtual characters, focuses on creating an agent architecture for intelligent, virtual characters. The goal is to create an agent that acts naturally and gives a realistic user experience. The purpose of this thesis is to develop and implement an appropriate learning model that fits the existing agent architecture, using an agile project methodology. The model developed can be seen as an online unsupervised reinforcement learning model that reinforces experiences through reward. It is based on Maes' model, where new effects are created depending on whether the agent is fulfilling its goals or not. The model we have developed is based on constant monitoring of the system. When an action is chosen, it is saved in a short-term memory. This memory is constantly updated with current information about the environment and the agent's state. These memories are evaluated against user-defined classes, which define the conditions that all values must satisfy for an experience to count as successful. Once the last memory in the list has been evaluated, it is saved in a long-term memory. This long-term memory continuously serves as the basis for how the agent's network is structured. The long-term memory is filtered based on where the agent is, how it feels, and its current state. Our model is evaluated in a series of tests that measure the agent's ability to adapt and how repetitive its behavior is. In practice, an agent with learning acquires a dynamic network based on input from the user, but after a short period it may look completely different, depending on the number of situations the agent has experienced and where it has been. An agent will have one network structure in the vicinity of food at location x and a completely different structure near an enemy at location y. 
If the agent enters a new situation where past experience does not favor it, it will explore all possible actions it can take, thereby creating new experiences. A comparison with an implementation without classification and learning indicates that the user needs to define fewer classes than the effects that would otherwise be required to cover all possible combinations: K_S + K_B classes create effects for S * B state/behavior combinations, where K_S and K_B are the numbers of state classes and behavior classes, and S and B are the numbers of states and behaviors in the network. === Cognitive models for virtual characters |
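The memory pipeline described in the abstract (chosen action → short-term memory → continuous state updates → class-based evaluation → long-term memory → situation-filtered recall) can be sketched as follows. This is a minimal illustration, not the thesis authors' implementation; all class names, fields, and the success criterion are assumptions.

```python
# Sketch of the learning loop from the abstract: actions enter a short-term
# memory, are refreshed with current state, evaluated against user-defined
# classes, and successful experiences are promoted to a long-term memory
# that is filtered by the agent's situation. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Memory:
    action: str
    state: dict          # snapshot of environment and agent state
    reward: float = 0.0  # experiences are reinforced through reward

@dataclass
class LearningModel:
    classes: dict = field(default_factory=dict)      # user-defined success tests
    short_term: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def choose(self, action, state):
        # Every chosen action is saved in short-term memory.
        self.short_term.append(Memory(action, dict(state)))

    def update(self, state):
        # Short-term memories are constantly updated with current
        # information about the environment and the agent's state.
        for mem in self.short_term:
            mem.state.update(state)

    def evaluate(self):
        # A memory whose user-defined class accepts its final state is
        # promoted to long-term memory, which drives network structure.
        while self.short_term:
            mem = self.short_term.pop(0)
            test = self.classes.get(mem.action, lambda s: False)
            if test(mem.state):
                self.long_term.append(mem)

    def recall(self, location):
        # Long-term memory is filtered on the agent's current situation
        # (here just location) before it shapes the behavior network.
        return [m for m in self.long_term if m.state.get("location") == location]
```

On the class-count argument at the end of the abstract: with, say, K_S = 3 state classes and K_B = 4 behavior classes (hypothetical numbers), the user defines K_S + K_B = 7 classes, whereas covering every state/behavior pair directly would require one hand-authored effect for each of the S * B combinations.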
author |
Wahlström, Jonathan Djupfeldt, Oscar |
author_facet |
Wahlström, Jonathan Djupfeldt, Oscar |
author_sort |
Wahlström, Jonathan |
title |
Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner |
title_short |
Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner |
title_full |
Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner |
title_fullStr |
Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner |
title_full_unstemmed |
Inlärning i Emotional Behavior Networks : Online Unsupervised Reinforcement Learning i kontinuerliga domäner |
title_sort |
inlärning i emotional behavior networks : online unsupervised reinforcement learning i kontinuerliga domäner |
publisher |
Linköpings universitet, Institutionen för teknik och naturvetenskap |
publishDate |
2010 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-54442 |
work_keys_str_mv |
AT wahlstromjonathan inlarningiemotionalbehaviornetworksonlineunsupervisedreinforcementlearningikontinuerligadomaner AT djupfeldtoscar inlarningiemotionalbehaviornetworksonlineunsupervisedreinforcementlearningikontinuerligadomaner AT wahlstromjonathan learninginemotionalbehaviornetworksonlineunsupervisedreinforcementlearningincontinuousdomains AT djupfeldtoscar learninginemotionalbehaviornetworksonlineunsupervisedreinforcementlearningincontinuousdomains |
_version_ |
1718608933564710912 |