Cognitive Mapping for Object Searching in Indoor Scenes

abstract: Visual navigation is a multi-disciplinary field spanning computer vision, machine learning, and robotics, and it is of great significance in both research and industrial applications. An intelligent agent with visual navigation ability will be capable of the following tasks: actively exploring an environment, distinguishing and localizing a requested target, and approaching the target using acquired strategies. Despite many advances in mobile robotics, endowing an autonomous agent with these abilities remains a challenging and complex task; solving it, however, is likely to accelerate the deployment of assistive robots. Reinforcement learning trains an autonomous robot by rewarding desired behaviors, helping it obtain an action policy that maximizes reward as the robot interacts with its environment. Through trial and error, the agent learns sophisticated strategies for handling complex tasks. Human navigation offers inspiration: when moving through an environment, people reason about accessible space and scene geometry from a first-person view, identify the destination, and then move toward it. Accordingly, this work develops a model that maps pixels to actions while inherently estimating both the target and a free-space map. The model has three major constituents: (i) a cognitive mapper that infers a topological free-space map from first-person-view images, (ii) a target recognition network that locates the desired object, and (iii) a deep-reinforcement-learning action policy network. Further, a planner model with a cascade architecture, operating on multi-scale semantic top-down occupancy maps, is proposed.
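The data flow the abstract describes (first-person image → free-space map and target estimate → action) can be sketched as below. This is a hypothetical illustration only: all module internals are placeholder stubs standing in for the learned networks, and the grid size, agent position, and action names are assumptions, not details from the thesis.

```python
import numpy as np

def cognitive_mapper(frame):
    """Stand-in for the learned cognitive mapper: project the first-person
    frame into an egocentric top-down free-space grid (1 = traversable)."""
    free_space = np.ones((8, 8), dtype=np.int8)
    free_space[0, :] = 0  # pretend the far row is a wall
    return free_space

def target_recognizer(frame, target_label):
    """Stand-in for the target recognition network: a top-down heatmap of
    where the requested object is believed to be."""
    heat = np.zeros((8, 8), dtype=np.float32)
    heat[2, 5] = 1.0  # pretend the target was detected at this cell
    return heat

def action_policy(free_space, target_heat):
    """Stand-in for the deep RL policy: greedily step toward the most
    likely target cell that is also traversable."""
    masked = target_heat * free_space
    ty, tx = np.unravel_index(np.argmax(masked), masked.shape)
    agent_y, agent_x = 7, 4  # assume the agent starts at the near edge
    if ty < agent_y:
        return "move_forward"
    return "turn_left" if tx < agent_x else "turn_right"

frame = np.zeros((64, 64, 3), dtype=np.uint8)  # dummy first-person image
free_space = cognitive_mapper(frame)
heat = target_recognizer(frame, "chair")
print(action_policy(free_space, heat))  # -> move_forward
```

In the thesis each stub would be a trained network, with the policy learned by reinforcement from rewards; the sketch only shows how the three constituents compose.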


Bibliographic Details
Other Authors: Zheng, Shibin (Author)
Format: Dissertation
Language:English
Published: 2019
Subjects: Computer engineering
Online Access:http://hdl.handle.net/2286/R.I.55588
Advisor: Yang, Yezhou
Committee members: Zhang, Wenlong; Ren, Yi
Publisher: Arizona State University
Degree: Masters Thesis, Computer Engineering, 2019
Extent: 61 pages
Rights: http://rightsstatements.org/vocab/InC/1.0/