Human activity prediction using saliency-aware motion enhancement and weighted LSTM network
Abstract In recent years, great progress has been made in recognizing human activities in complete image sequences. However, predicting human activity earlier in a video is still a challenging task. In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-...
Main Authors: | Zhengkui Weng, Wuzhao Li, Zhipeng Jin |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2021-01-01 |
Series: | EURASIP Journal on Image and Video Processing |
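Subjects: | Activity prediction; Weighted long short-term memory network; Dynamic contrast segmentation; Saliency-aware motion enhancement |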
Online Access: | https://doi.org/10.1186/s13640-020-00544-0 |
id |
doaj-0c57ac4146924f199d34eb833e7c172c |
---|---|
record_format |
Article |
spelling |
doaj-0c57ac4146924f199d34eb833e7c172c; 2021-01-17T12:16:47Z; eng; SpringerOpen; EURASIP Journal on Image and Video Processing; 1687-5281; 2021-01-01; 20211123; 10.1186/s13640-020-00544-0; Human activity prediction using saliency-aware motion enhancement and weighted LSTM network; Zhengkui Weng (0); Wuzhao Li (1); Zhipeng Jin (2); Jiaxing Vocational and Technical College; Jiaxing Vocational and Technical College; Jiaxing Vocational and Technical College; Abstract In recent years, great progress has been made in recognizing human activities in complete image sequences. However, predicting human activity earlier in a video is still a challenging task. In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-aware motion enhancement (SME) is proposed for video activity prediction. First, a boundary-prior based motion segmentation method is introduced to use shortest geodesic distance in an undirected weighted graph. Next, a dynamic contrast segmentation strategy is proposed to segment the moving object in a complex environment. Then, the SME is constructed to enhance the moving object by suppressing irrelevant background in each frame. Moreover, an effective long-range attention mechanism is designed to further deal with the long-term dependency of complex non-periodic activities by automatically focusing more on the semantic critical frames instead of processing all sampled frames equally. Thus, the learned weights can highlight the discriminative frames and reduce the temporal redundancy. Finally, we evaluate our framework on the UT-Interaction and sub-JHMDB datasets. The experimental results show that WLSTM with SME statistically outperforms a number of state-of-the-art methods on both datasets.; https://doi.org/10.1186/s13640-020-00544-0; Activity prediction; Weighted long short-term memory network; Dynamic contrast segmentation; Saliency-aware motion enhancement |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Zhengkui Weng Wuzhao Li Zhipeng Jin |
author_sort |
Zhengkui Weng |
title |
Human activity prediction using saliency-aware motion enhancement and weighted LSTM network |
publisher |
SpringerOpen |
series |
EURASIP Journal on Image and Video Processing |
issn |
1687-5281 |
publishDate |
2021-01-01 |
description |
Abstract In recent years, great progress has been made in recognizing human activities in complete image sequences. However, predicting human activity earlier in a video is still a challenging task. In this paper, a novel framework named weighted long short-term memory network (WLSTM) with saliency-aware motion enhancement (SME) is proposed for video activity prediction. First, a boundary-prior based motion segmentation method is introduced to use shortest geodesic distance in an undirected weighted graph. Next, a dynamic contrast segmentation strategy is proposed to segment the moving object in a complex environment. Then, the SME is constructed to enhance the moving object by suppressing irrelevant background in each frame. Moreover, an effective long-range attention mechanism is designed to further deal with the long-term dependency of complex non-periodic activities by automatically focusing more on the semantic critical frames instead of processing all sampled frames equally. Thus, the learned weights can highlight the discriminative frames and reduce the temporal redundancy. Finally, we evaluate our framework on the UT-Interaction and sub-JHMDB datasets. The experimental results show that WLSTM with SME statistically outperforms a number of state-of-the-art methods on both datasets. |
topic |
Activity prediction Weighted long short-term memory network Dynamic contrast segmentation Saliency-aware motion enhancement |
url |
https://doi.org/10.1186/s13640-020-00544-0 |