Cognitive Mapping for Object Searching in Indoor Scenes
Abstract: Visual navigation is a multi-disciplinary field spanning computer vision, machine learning, and robotics, and it is of great significance in both research and industrial applications. An intelligent agent with visual navigation ability can actively explore an environment, recognize and localize a requested target, and approach that target using acquired strategies. Despite many advances in mobile robotics, building an autonomous robot with these abilities remains a challenging and complex task; a solution, however, would likely accelerate the deployment of assistive robots. Reinforcement learning trains an autonomous robot by rewarding desired behaviors, so that it acquires an action policy that maximizes reward while interacting with the environment. Through trial and error, the agent learns sophisticated and skillful strategies for handling complex tasks. Inspired by how humans navigate, reasoning about accessible space and the geometry of the environment from a first-person view, identifying the destination, and then moving toward it, this work develops a model that maps pixels to actions while inherently estimating the target as well as the free-space map. The model has three major constituents: (i) a cognitive mapper that builds a topological free-space map from first-person-view images, (ii) a target recognition network that locates the desired object, and (iii) a deep reinforcement learning network for the action policy. Further, a planner model with a cascade architecture based on multi-scale semantic top-down occupancy map input is proposed.
Other Authors: | Yang, Yezhou (Advisor); Zhang, Wenlong (Committee member); Ren, Yi (Committee member) |
---|---|
Format: | Dissertation |
Language: | English |
Published: | Arizona State University, 2019 |
Subjects: | Computer engineering |
Online Access: | http://hdl.handle.net/2286/R.I.55588 |
Author: | Zheng, Shibin |
---|---|
Degree: | Masters Thesis, Computer Engineering, 2019 |
Extent: | 61 pages |
Rights: | http://rightsstatements.org/vocab/InC/1.0/ |