APRENDIZAGEM POR REFORÇO E PROGRAMACÃO DINÂMICA ADAPTATIVA PARA PROJETO E AVALIAÇÃO DO DESEMPENHO DE ALGORITMOS DLQR EM SISTEMAS MIMO

Made available in DSpace on 2016-08-17T14:53:16Z (GMT). No. of bitstreams: 1 Leandro Rocha Lopes.pdf: 1075564 bytes, checksum: 01e184ed6d7c65323c0dfc1515da19a3 (MD5) Previous issue date: 2011-04-04 === Driven by growing technological development and its associated industrial applications,...


Bibliographic Details
Main Author:	Lopes, Leandro Rocha
Other Authors:	Fonseca Neto, João Viana da
Format:	Others
Language:	Portuguese
Published:	Universidade Federal do Maranhão 2016
Subjects:	Programação Dinâmica; Controle ótimo; HDP; Q-Function; ADHDP; Sistemas Multivariáveis; Convergência; DLQR; Dynamic Programming; Optimal Control; Multivariable Systems; Convergence; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::ANALISE DE ALGORITMOS E COMPLEXIDADE DE COMPUTACAO
Online Access:	http://tedebc.ufma.br:8080/jspui/handle/tede/462

id	ndltd-IBICT-oai-tede2-tede-462
record_format	oai_dc
spelling	ndltd-IBICT-oai-tede2-tede-462 2019-01-22T00:41:42Z LEARNING BY STRENGTHENING AND ADAPTIVE DYNAMIC PROGRAMMING FOR DESIGN AND EVALUATION OF PERFORMANCE DLQR ALGORITHMS IN MIMO SYSTEMS 2016-08-17T14:53:16Z 2011-05-11 2011-04-04 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis LOPES, Leandro Rocha. LEARNING BY STRENGTHENING AND ADAPTIVE DYNAMIC PROGRAMMING FOR DESIGN AND EVALUATION OF PERFORMANCE DLQR ALGORITHMS IN MIMO SYSTEMS. 2011. 130 f. Dissertação (Mestrado em Engenharia) - Universidade Federal do Maranhão, São Luis, 2011. http://tedebc.ufma.br:8080/jspui/handle/tede/462 por info:eu-repo/semantics/openAccess application/pdf Universidade Federal do Maranhão PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE ELETRICIDADE/CCET UFMA BR Engenharia reponame:Biblioteca Digital de Teses e Dissertações da UFMA instname:Universidade Federal do Maranhão instacron:UFMA
collection	NDLTD
language	Portuguese
format	Others
sources	NDLTD
topic	Programação Dinâmica; Controle ótimo; HDP; Q-Function; ADHDP; Sistemas Multivariáveis; Convergência; DLQR; Dynamic Programming; Optimal Control; Multivariable Systems; Convergence; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO::ANALISE DE ALGORITMOS E COMPLEXIDADE DE COMPUTACAO
description	Made available in DSpace on 2016-08-17T14:53:16Z (GMT). No. of bitstreams: 1 Leandro Rocha Lopes.pdf: 1075564 bytes, checksum: 01e184ed6d7c65323c0dfc1515da19a3 (MD5) Previous issue date: 2011-04-04 === Driven by growing technological development and the industrial applications that follow from it, high-performance control techniques and reinforcement learning are being developed not only to solve new problems but also to improve the performance of controllers already implemented in real-world systems. The reinforcement learning (RL) and discrete linear quadratic regulator (DLQR) approaches are connected by adaptive dynamic programming (ADP). This connection is oriented toward the design of optimal controllers for multivariable (MIMO) systems. The proposed method for tuning DLQR controllers provides guidelines for building biased heuristics that are applied in selecting the weighting matrices of the instantaneous reward. The performance of the heuristics associated with tuning discrete linear controllers is investigated, together with the convergence aspects related to the QR variations in the heuristic dynamic programming (HDP) and action-dependent heuristic dynamic programming (ADHDP) algorithms. The algorithms and the tuning are evaluated by their ability to establish the optimal control policy that maps the Z-plane in a third-order multivariable dynamic system.
author2	Fonseca Neto, João Viana da
author_facet	Fonseca Neto, João Viana da Lopes, Leandro Rocha
author	Lopes, Leandro Rocha
author_sort	Lopes, Leandro Rocha
title	APRENDIZAGEM POR REFORÇO E PROGRAMACÃO DINÂMICA ADAPTATIVA PARA PROJETO E AVALIAÇÃO DO DESEMPENHO DE ALGORITMOS DLQR EM SISTEMAS MIMO
title_sort	aprendizagem por reforço e programacão dinâmica adaptativa para projeto e avaliação do desempenho de algoritmos dlqr em sistemas mimo
publisher	Universidade Federal do Maranhão
publishDate	2016
url	http://tedebc.ufma.br:8080/jspui/handle/tede/462
_version_	1718925662241161216
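
The description field above links DLQR design to heuristic dynamic programming through the weighting matrices Q and R of the instantaneous reward. As a rough illustration of that link, the sketch below runs the model-based value-iteration recursion P_{j+1} = Q + A'P_jA - A'P_jB (R + B'P_jB)^{-1} B'P_jA starting from P_0 = 0, which is the textbook form an HDP scheme approximates for a linear plant with quadratic cost. It is not taken from the thesis: the third-order, two-input plant (A, B), the identity weights, the tolerance, and all function names are hypothetical choices made only for this example, and the thesis's own HDP and ADHDP implementations are not shown in this record.

```python
"""Model-based HDP-style value iteration for a DLQR problem (illustrative sketch)."""


def matmul(X, Y):
    # Plain nested-list matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def add(X, Y, sign=1.0):
    # Element-wise X + sign * Y.
    return [[a + sign * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def inv2(M):
    # Inverse of a 2x2 matrix; enough here because the example has two inputs.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def gain(P, A, B, R):
    # Greedy feedback gain for the value matrix P: K = (R + B'PB)^-1 B'PA.
    Bt = transpose(B)
    return matmul(inv2(add(R, matmul(Bt, matmul(P, B)))), matmul(Bt, matmul(P, A)))


def hdp_dlqr(A, B, Q, R, tol=1e-10, max_iter=10000):
    """Iterate the Riccati recursion from P = 0 until the update is below tol."""
    n = len(A)
    At = transpose(A)
    P = [[0.0] * n for _ in range(n)]
    for j in range(1, max_iter + 1):
        K = gain(P, A, B, R)
        # P_next = Q + A'PA - A'PB K
        P_next = add(add(Q, matmul(At, matmul(P, A))),
                     matmul(matmul(At, matmul(P, B)), K), sign=-1.0)
        delta = max(abs(p - q) for rp, rq in zip(P_next, P) for p, q in zip(rp, rq))
        P = P_next
        if delta < tol:
            return P, gain(P, A, B, R), j
    raise RuntimeError("value iteration did not converge")


# Hypothetical third-order plant with two inputs (assumed, not from the thesis).
A = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.0, 0.0, 0.7]]
B = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]
Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # state weighting
R = [[1.0, 0.0], [0.0, 1.0]]                              # input weighting

P, K, iterations = hdp_dlqr(A, B, Q, R)
print("iterations:", iterations)
print("gain K:", K)
```

Changing the entries of Q and R and rerunning shows how the weighting choice affects both the resulting gain and the number of iterations needed to converge, which is the kind of QR variation the description refers to.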


Similar Items