DVC‐Net: A deep neural network model for dense video captioning

Abstract: Dense video captioning (DVC) detects multiple events in an input video and generates natural language sentences to describe each event. Previous studies predominantly used convolutional neural networks to extract visual features from videos but failed to employ high‐level semantics to effec...

Bibliographic Details
Main Authors: Sujin Lee, Incheol Kim
Format: Article
Language: English
Published: Wiley 2021-02-01
Series: IET Computer Vision
Online Access: https://doi.org/10.1049/cvi2.12013