Learning from Noisy Labels for Visual Question Answering

Master's === National Chiao Tung University === Institute of Multimedia Engineering === 107 === This thesis conducts a study of learning algorithms to address noisy label issues inherent in Visual Question Answering (VQA) tasks. The noisy labelling in VQA tasks refers to the phenomenon of possibly collecting different answers to an image-question pair fro...

Bibliographic Details
Main Authors: Pan, Wei-Zhi, 潘韋志
Other Authors: Peng, Wen-Hsiao
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/jnaynq
id ndltd-TW-107NCTU5641007
record_format oai_dc
spelling ndltd-TW-107NCTU5641007 2019-05-16T01:40:47Z http://ndltd.ncl.edu.tw/handle/jnaynq Learning from Noisy Labels for Visual Question Answering 利用錯誤標籤學習演算法於視覺問答應用 Pan, Wei-Zhi 潘韋志 Master's National Chiao Tung University Institute of Multimedia Engineering 107 Peng, Wen-Hsiao 彭文孝 2018 thesis 40 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Multimedia Engineering === 107 === This thesis studies learning algorithms that address the noisy-label problem inherent in Visual Question Answering (VQA) tasks. Label noise in VQA refers to different human subjects giving different answers to the same image-question pair. This often arises because some image-question pairs create an ambiguous context that leads to indefinite answers. Training with such noisy supervision degrades the performance of the VQA model. To address the problem, we first survey three mainstream approaches to learning from noisy labels: (1) loss correction, (2) label cleansing, and (3) graphical models. We then implement these algorithms on top of a dual-attention VQA network (which we call the base VQA model) and test their performance on the VirginiaTech VQA dataset. Experimental results show that (1) the performance of the loss-correction algorithms relies heavily on accurate estimation of the noise-induced label transition probabilities or on accurate detection of the noise level, (2) the label-cleansing algorithms require enough verified labels to perform effectively, and (3) the graphical models need to differentiate the noise level of each QA input to work well. In addition, the capability of the base VQA model can have a profound effect on the performance of these noisy-label learning algorithms.
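The abstract notes that loss-correction methods depend on an accurate estimate of the label transition probabilities. The sketch below illustrates why, using one common form of loss correction (forward correction), which may not be the variant implemented in the thesis. The function names, the three-answer setup, and the transition matrix values are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution over answers."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_corrected_nll(logits, noisy_label, transition):
    """Negative log-likelihood of the observed (possibly noisy) label.

    transition[i][j] is the assumed probability that an example whose
    true answer is i was annotated with answer j. The model's clean
    prediction is pushed through this matrix before the loss is taken.
    """
    clean_probs = softmax(logits)
    noisy_prob = sum(
        clean_probs[i] * transition[i][noisy_label]
        for i in range(len(clean_probs))
    )
    return -math.log(noisy_prob)


# Hypothetical example with three candidate answers.
logits = [2.0, 0.5, -1.0]

# Identity matrix: assumes no label noise, so this is plain cross-entropy.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Assumed symmetric noise: each answer is mislabeled 20% of the time.
noisy_t = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]

plain = forward_corrected_nll(logits, 1, identity)
corrected = forward_corrected_nll(logits, 1, noisy_t)
print(plain, corrected)
```

With the assumed noise matrix, the loss for an annotation that disagrees with the model's top answer is smaller than under plain cross-entropy, because part of the disagreement is attributed to annotator noise. If the matrix is estimated poorly, that attribution is wrong, which is consistent with the sensitivity the abstract reports.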
author2 Peng, Wen-Hsiao
author_facet Peng, Wen-Hsiao
Pan, Wei-Zhi
潘韋志
author Pan, Wei-Zhi
潘韋志
spellingShingle Pan, Wei-Zhi
潘韋志
Learning from Noisy Labels for Visual Question Answering
author_sort Pan, Wei-Zhi
title Learning from Noisy Labels for Visual Question Answering
publishDate 2018
url http://ndltd.ncl.edu.tw/handle/jnaynq