End-to-End Sentence-Level Multi-View Lipreading Architecture with Spatial Attention Module Integrated Multiple CNNs and Cascaded Local Self-Attention-CTC

Concomitant with the recent advances in deep learning, automatic speech recognition and visual speech recognition (VSR) have received considerable attention. However, although VSR systems must identify speech from both frontal and profile faces in real-world scenarios, most VSR studies have focused...

Bibliographic Details
Main Authors:	Jeon, S. (Author), Kim, M.S. (Author)
Format:	Article
Language:	English
Published:	MDPI 2022
Subjects:	Attention mechanism; Classification (of information); Connectionist temporal classification; Convolution; Convolutional neural network; Convolutional neural networks; Deep learning; Lipreading; Local self-attention; Multi-view visual speech recognition; Multi-view VSR; Multi-views; Network architecture; Spatial attention; Spatial attention module; Speech; Speech recognition; Temporal classification; Visual speech recognition
Online Access:	View Fulltext in Publisher

Similar Items