Co-Attention Network With Question Type for Visual Question Answering

Visual Question Answering (VQA) is a challenging multi-modal learning task since it requires an understanding of both visual and textual modalities simultaneously. Therefore, the approaches used to represent the images and questions in a fine-grained manner play key roles in the performance. In orde...

Bibliographic Details
Main Authors: Chao Yang, Mengqi Jiang, Bin Jiang, Weixin Zhou, Keqin Li
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8676009/