Human activity prediction using saliency-aware motion enhancement and weighted LSTM network

Abstract In recent years, great progress has been made in recognizing human activities in complete image sequences. However, predicting human activity earlier in a video is still a challenging task. In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-...

Bibliographic Details
Main Authors: Zhengkui Weng, Wuzhao Li, Zhipeng Jin
Format: Article
Language: English
Published: SpringerOpen 2021-01-01
Series: EURASIP Journal on Image and Video Processing
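Subjects: Activity prediction; Weighted long short-term memory network; Dynamic contrast segmentation; Saliency-aware motion enhancement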
Online Access: https://doi.org/10.1186/s13640-020-00544-0
id doaj-0c57ac4146924f199d34eb833e7c172c
record_format Article
spelling doaj-0c57ac4146924f199d34eb833e7c172c; 2021-01-17T12:16:47Z; eng; SpringerOpen; EURASIP Journal on Image and Video Processing; 1687-5281; 2021-01-01; 20211123; 10.1186/s13640-020-00544-0; Human activity prediction using saliency-aware motion enhancement and weighted LSTM network; Zhengkui Weng (0); Wuzhao Li (1); Zhipeng Jin (2); Jiaxing Vocational and Technical College; Jiaxing Vocational and Technical College; Jiaxing Vocational and Technical College; https://doi.org/10.1186/s13640-020-00544-0; Activity prediction; Weighted long short-term memory network; Dynamic contrast segmentation; Saliency-aware motion enhancement
collection DOAJ
language English
format Article
sources DOAJ
author Zhengkui Weng
Wuzhao Li
Zhipeng Jin
title Human activity prediction using saliency-aware motion enhancement and weighted LSTM network
publisher SpringerOpen
series EURASIP Journal on Image and Video Processing
issn 1687-5281
publishDate 2021-01-01
description Abstract In recent years, great progress has been made in recognizing human activities in complete image sequences. However, predicting human activity earlier in a video is still a challenging task. In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-aware motion enhancement (SME) is proposed for video activity prediction. First, a boundary-prior based motion segmentation method is introduced to use shortest geodesic distance in an undirected weighted graph. Next, a dynamic contrast segmentation strategy is proposed to segment the moving object in a complex environment. Then, the SME is constructed to enhance the moving object by suppressing irrelevant background in each frame. Moreover, an effective long-range attention mechanism is designed to further deal with the long-term dependency of complex non-periodic activities by automatically focusing more on the semantic critical frames instead of processing all sampled frames equally. Thus, the learned weights can highlight the discriminative frames and reduce the temporal redundancy. Finally, we evaluate our framework on the UT-Interaction and sub-JHMDB datasets. The experimental results show that WLSTM with SME statistically outperforms a number of state-of-the-art methods on both datasets.
topic Activity prediction
Weighted long short-term memory network
Dynamic contrast segmentation
Saliency-aware motion enhancement
url https://doi.org/10.1186/s13640-020-00544-0