A study in human attention to guide computational action recognition

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. === Cataloged from PDF version of thesis. === Includes bibliographical references (pages 93-95). === Computer vision researchers have a lot to learn from the human visual system....

Bibliographic Details
Main Author:	Sinai, Sam
Other Authors:	Patrick H. Winston.
Format:	Others
Language:	English
Published:	Massachusetts Institute of Technology 2014
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/91871

id	ndltd-MIT-oai-dspace.mit.edu-1721.1-91871
record_format	oai_dc
spelling	ndltd-MIT-oai-dspace.mit.edu-1721.1-91871 2019-05-02T15:36:16Z A study in human attention to guide computational action recognition Sinai, Sam Patrick H. Winston. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. Cataloged from PDF version of thesis. Includes bibliographical references (pages 93-95). by Sam Sinai. M. Eng. 2014-11-24T18:41:27Z 2014-11-24T18:41:27Z 2014 2014 Thesis http://hdl.handle.net/1721.1/91871 894355843 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 95 pages application/pdf Massachusetts Institute of Technology
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Electrical Engineering and Computer Science.
spellingShingle	Electrical Engineering and Computer Science. Sinai, Sam A study in human attention to guide computational action recognition
description	Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. === Cataloged from PDF version of thesis. === Includes bibliographical references (pages 93-95). === Computer vision researchers have a lot to learn from the human visual system. We, as humans, are usually unaware of how enormously difficult it is to watch a scene and summarize its most important events in words. We only begin to appreciate this truth when we attempt to build a system that performs comparably. In this thesis, I study two features of the human visual apparatus: Attention and Peripheral Vision. I then use these to propose heuristics for computational approaches to action recognition. I think that building a system modeled after human vision, with its nonuniform distribution of resolution and processing power, can greatly increase the performance of computer systems that target action recognition. In this study: (i) I develop and construct tools that allow me to study human vision and its role in action recognition, (ii) I perform four distinct experiments to gain insight into the role of attention and peripheral vision in this task, and (iii) I propose computational heuristics, as well as mechanisms, that I believe will increase the efficiency and recognition power of artificial vision systems. The tools I have developed can be applied to a variety of studies, including those performed on online crowd-sourcing markets (e.g. Amazon's Mechanical Turk). With my human experiments, I demonstrate that there is consistency of visual behavior among multiple subjects when they are asked to report the occurrence of a verb. Further, I demonstrate that while peripheral vision may play a small direct role in action recognition, it is a key component of attentional allocation, whereby it becomes fundamental to action recognition. Moreover, I propose heuristics, based on these experiments, that can inform artificial systems. In particular, I argue that the proper medium for action recognition is video, not still images, and that the basic driver of attention should be movement. Finally, I outline a computational mechanism that incorporates these heuristics into an implementable scheme. === by Sam Sinai. === M. Eng.
author2	Patrick H. Winston.
author_facet	Patrick H. Winston. Sinai, Sam
author	Sinai, Sam
author_sort	Sinai, Sam
title	A study in human attention to guide computational action recognition
title_short	A study in human attention to guide computational action recognition
title_full	A study in human attention to guide computational action recognition
title_fullStr	A study in human attention to guide computational action recognition
title_full_unstemmed	A study in human attention to guide computational action recognition
title_sort	study in human attention to guide computational action recognition
publisher	Massachusetts Institute of Technology
publishDate	2014
url	http://hdl.handle.net/1721.1/91871
work_keys_str_mv	AT sinaisam astudyinhumanattentiontoguidecomputationalactionrecognition AT sinaisam studyinhumanattentiontoguidecomputationalactionrecognition
_version_	1719025178810253312

Similar Items