Multi-View Attention Network for Visual Dialog

Visual dialog is a challenging vision-language task in which a series of questions visually grounded by a given image are answered. To resolve the visual dialog task, a high-level understanding of various multimodal inputs (e.g., question, dialog history, and image) is required. Specifically, it is...

Bibliographic Details
Main Authors: Sungjin Park, Taesun Whang, Yeochan Yoon, Heuiseok Lim
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/11/7/3009