A Fine-Grained Spatial-Temporal Attention Model for Video Captioning

The attention mechanism has been used extensively in video captioning tasks, enabling deeper visual understanding. However, most existing video captioning methods apply the attention mechanism at the frame level, which only models the temporal structure and generated words, bu...

Bibliographic Details
Main Authors: An-An Liu, Yurui Qiu, Yongkang Wong, Yu-Ting Su, Mohan Kankanhalli
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8523661/

Similar Items