Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks
Sign language recognition aims to recognize meaningful movements of hand gestures and is a significant solution in intelligent communication between the deaf community and hearing societies. However, until now, the current dynamic sign language recognition methods still have some drawbacks with diff...
Main Authors: | Yanqiu Liao, Pengwen Xiong, Weidong Min, Weiqiong Min, Jiahao Lu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | Dynamic sign language recognition, bi-directional LSTM, residual ConvNet, video sequence |
Online Access: | https://ieeexplore.ieee.org/document/8667292/ |
id |
doaj-fa345f1359b04c59bbf2db6c2b9e6245 |
---|---|
record_format |
Article |
spelling |
doaj-fa345f1359b04c59bbf2db6c2b9e6245; 2021-04-05T16:59:44Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 38044-38054; DOI 10.1109/ACCESS.2019.2904749; document 8667292; Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks; Yanqiu Liao (School of Information Engineering, Nanchang University, Nanchang, China); Pengwen Xiong (School of Information Engineering, Nanchang University, Nanchang, China); Weidong Min (https://orcid.org/0000-0003-2526-2181; School of Software, Nanchang University, Nanchang, China); Weiqiong Min (School of Tourism, Jiangxi Science & Technology Normal University, Nanchang, China); Jiahao Lu (School of Information Engineering, Nanchang University, Nanchang, China); Sign language recognition aims to recognize meaningful hand gesture movements and is an important means of intelligent communication between the deaf community and hearing society. However, current dynamic sign language recognition methods still have drawbacks: difficulty recognizing complex hand gestures, low recognition accuracy for most dynamic sign language, and potential problems when training on larger video sequence data. To address these issues, this paper presents a multimodal dynamic sign language recognition method based on a deep 3-dimensional residual ConvNet and bi-directional LSTM networks, named the BLSTM-3D residual network (B3D ResNet). The method consists of three main parts. First, the hand is localized in the video frames to reduce the time and space complexity of the network computation. Then, the B3D ResNet automatically extracts spatiotemporal features from the video sequences and, after feature analysis, establishes an intermediate score for each action in the video sequence. Finally, the video sequences are classified to accurately identify the dynamic sign language. Experiments are conducted on two test datasets, DEVISIGN_D and SLR_Dataset. The results show that the proposed method obtains state-of-the-art recognition accuracy (89.8% on DEVISIGN_D and 86.9% on SLR_Dataset). In addition, the B3D ResNet can effectively recognize complex hand gestures from larger video sequence data and obtains high recognition accuracy on a vocabulary of 500 words from Chinese sign language.; https://ieeexplore.ieee.org/document/8667292/; Dynamic sign language recognition; bi-directional LSTM; residual ConvNet; video sequence |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yanqiu Liao Pengwen Xiong Weidong Min Weiqiong Min Jiahao Lu |
author_sort |
Yanqiu Liao |
title |
Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks |
title_sort |
dynamic sign language recognition based on video sequence with blstm-3d residual networks |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
Sign language recognition aims to recognize meaningful hand gesture movements and is an important means of intelligent communication between the deaf community and hearing society. However, current dynamic sign language recognition methods still have drawbacks: difficulty recognizing complex hand gestures, low recognition accuracy for most dynamic sign language, and potential problems when training on larger video sequence data. To address these issues, this paper presents a multimodal dynamic sign language recognition method based on a deep 3-dimensional residual ConvNet and bi-directional LSTM networks, named the BLSTM-3D residual network (B3D ResNet). The method consists of three main parts. First, the hand is localized in the video frames to reduce the time and space complexity of the network computation. Then, the B3D ResNet automatically extracts spatiotemporal features from the video sequences and, after feature analysis, establishes an intermediate score for each action in the video sequence. Finally, the video sequences are classified to accurately identify the dynamic sign language. Experiments are conducted on two test datasets, DEVISIGN_D and SLR_Dataset. The results show that the proposed method obtains state-of-the-art recognition accuracy (89.8% on DEVISIGN_D and 86.9% on SLR_Dataset). In addition, the B3D ResNet can effectively recognize complex hand gestures from larger video sequence data and obtains high recognition accuracy on a vocabulary of 500 words from Chinese sign language. |
topic |
Dynamic sign language recognition bi-directional LSTM residual ConvNet video sequence |
url |
https://ieeexplore.ieee.org/document/8667292/ |
_version_ |
1721540544849510400 |
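
The three-stage pipeline summarized in the abstract (hand localization, spatiotemporal feature extraction with residual connections and a bi-directional recurrent pass, then sequence classification from intermediate scores) can be illustrated with a minimal, dependency-free sketch. This is not the authors' B3D ResNet: the 3D convolutions are replaced by a toy per-frame feature, the BLSTM by a simple forward and backward running average, and every name here (`crop_hand`, `residual`, `bidirectional`, `classify`, `recognize`, the `box` and `weights` parameters) is a hypothetical chosen for illustration. The hand bounding box is assumed to be given.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Frame = List[List[float]]


def crop_hand(frame: Frame, box: Tuple[int, int, int, int]) -> Frame:
    """Stage 1 (assumed box): keep only the hand region.

    box = (top, left, height, width) in pixel coordinates.
    """
    top, left, height, width = box
    return [row[left:left + width] for row in frame[top:top + height]]


def residual(x: Vector, f: Callable[[Vector], Vector]) -> Vector:
    """Residual connection: output = f(x) + x, element-wise."""
    return [fx + xi for fx, xi in zip(f(x), x)]


def bidirectional(seq: Sequence[Vector], decay: float = 0.5) -> List[Vector]:
    """Toy stand-in for a BLSTM (requires a non-empty sequence).

    Runs a leaky running average forward and backward over time and
    concatenates the two states at every time step.
    """
    def scan(items: List[Vector]) -> List[Vector]:
        state = [0.0] * len(items[0])
        out = []
        for x in items:
            state = [decay * s + (1 - decay) * xi for s, xi in zip(state, x)]
            out.append(state)
        return out

    fwd = scan(list(seq))
    bwd = scan(list(reversed(seq)))[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(step_scores: Sequence[Vector]) -> Tuple[int, Vector]:
    """Stage 3: average the per-step intermediate scores, then pick the argmax."""
    n = len(step_scores)
    mean = [sum(col) / n for col in zip(*step_scores)]
    probs = softmax(mean)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


def recognize(frames: Sequence[Frame], box: Tuple[int, int, int, int],
              weights: Sequence[Vector]) -> Tuple[int, Vector]:
    """Chain the three stages; weights has one row per sign class."""
    feats = []
    for frame in frames:
        hand = crop_hand(frame, box)
        # Toy feature: mean intensity of each row of the hand crop.
        x = [sum(row) / len(row) for row in hand]
        feats.append(residual(x, lambda v: [0.1 * vi for vi in v]))
    context = bidirectional(feats)
    # Linear scoring layer: one intermediate score per class per time step.
    scores = [[sum(w * c for w, c in zip(row, ctx)) for row in weights]
              for ctx in context]
    return classify(scores)
```

Calling `recognize(frames, box, weights)` returns a predicted class index and the class probabilities. In a real system the toy feature and the running average would be replaced by trained 3D residual convolution blocks and LSTM cells; the sketch only shows how the stages connect.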