Multimodal Feature Learning for Video Captioning

Video captioning refers to the task of generating a natural language sentence that explains the content of the input video clips. This study proposes a deep neural network model for effective video captioning. Apart from visual features, the proposed model learns additionally semantic features that...

Full description

Bibliographic Details
Main Authors:	Sujin Lee, Incheol Kim
Format:	Article
Language:	English
Published:	Hindawi Limited 2018-01-01
Series:	Mathematical Problems in Engineering
Online Access:	http://dx.doi.org/10.1155/2018/3125879

Internet

http://dx.doi.org/10.1155/2018/3125879

Multimodal Feature Learning for Video Captioning

Internet

Similar Items