A Fine-Grained Spatial-Temporal Attention Model for Video Captioning

The attention mechanism has been extensively used in video captioning tasks, where it enables further development of deeper visual understanding. However, most existing video captioning methods apply the attention mechanism at the frame level, which only models the temporal structure and generated words, bu...

Bibliographic Details
Main Authors:	An-An Liu, Yurui Qiu, Yongkang Wong, Yu-Ting Su, Mohan Kankanhalli
Format:	Article
Language:	English
Published:	IEEE 2018-01-01
Series:	IEEE Access
Subjects:	Fine-grained; spatial-temporal; mask pooling; video captioning
Online Access:	https://ieeexplore.ieee.org/document/8523661/

Similar Items