A Fine-Grained Spatial-Temporal Attention Model for Video Captioning
The attention mechanism has been extensively used in video captioning tasks, enabling further development of deeper visual understanding. However, most existing video captioning methods apply the attention mechanism at the frame level, which only models the temporal structure and generated words, but...
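To make the abstract's distinction concrete, the sketch below illustrates plain frame-level (temporal) attention, where each frame gets a single scalar weight and any spatial detail within a frame is averaged away. This is a minimal, generic illustration, not the paper's actual model; all names, shapes, and dimensions here are assumptions for demonstration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical setup: T video frames, each encoded as a d-dim feature vector.
T, d = 8, 16
rng = np.random.default_rng(0)
frame_feats = rng.standard_normal((T, d))   # stand-in for per-frame CNN features
decoder_state = rng.standard_normal(d)      # stand-in for the caption decoder's state

# Frame-level temporal attention: one score per frame, so the model can
# decide *when* to look, but not *where* within a frame.
scores = frame_feats @ decoder_state        # (T,) relevance of each frame
weights = softmax(scores)                   # attention distribution over time
context = weights @ frame_feats             # (d,) weighted visual context vector
```

A fine-grained spatial-temporal model, by contrast, would attend over regions within each frame in addition to attending over frames.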
Main Authors: An-An Liu, Yurui Qiu, Yongkang Wong, Yu-Ting Su, Mohan Kankanhalli
Format: Article
Language: English
Published: IEEE, 2018-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8523661/
Similar Items
- Sequential Dual Attention: Coarse-to-Fine-Grained Hierarchical Generation for Image Captioning
  by: Zhibin Guan, et al. Published: (2018-11-01)
- Video captioning with stacked attention and semantic hard pull
  by: Md. Mushfiqur Rahman, et al. Published: (2021-08-01)
- Video Caption Based Searching Using End-to-End Dense Captioning and Sentence Embeddings
  by: Akshay Aggarwal, et al. Published: (2020-06-01)
- Variational Autoencoder-Based Multiple Image Captioning Using a Caption Attention Map
  by: Boeun Kim, et al. Published: (2019-07-01)
- Video Captioning Based on Channel Soft Attention and Semantic Reconstructor
  by: Zhou Lei, et al. Published: (2021-02-01)