A novel multi-stream method for violent interaction detection using deep learning
Violent interaction detection is a hot topic in computer vision. However, recent research on violent interaction detection mainly focuses on traditional hand-crafted features and does not make full use of deep learning results in computer vision. In this paper, we propose...
Main Authors: | Hongchang Li, Jing Wang, Jianjun Han, Jinmin Zhang, Yushan Yang, Yue Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2020-05-01 |
Series: | Measurement + Control |
Online Access: | https://doi.org/10.1177/0020294020902788 |
id |
doaj-491ff4dddddb450a9c4a6642aa5f1561 |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hongchang Li Jing Wang Jianjun Han Jinmin Zhang Yushan Yang Yue Zhao |
title |
A novel multi-stream method for violent interaction detection using deep learning |
publisher |
SAGE Publishing |
series |
Measurement + Control |
issn |
0020-2940 |
publishDate |
2020-05-01 |
description |
Violent interaction detection is a hot topic in computer vision. However, recent research on violent interaction detection mainly focuses on traditional hand-crafted features and does not make full use of deep learning results in computer vision. In this paper, we propose a new robust violent interaction detection framework based on multi-stream deep learning for surveillance scenes. The proposed approach enhances the recognition performance of violent actions in video by fusing three different streams: an attention-based spatial RGB stream, a temporal stream, and a local spatial stream. The attention-based spatial RGB stream uses a soft-attention mechanism to learn the spatial attention regions of persons that have a high probability of being action regions. The temporal stream takes optical flow as input to extract temporal features. The local spatial stream learns local spatial features using block images as input. Experimental results demonstrate the effectiveness and reliability of the proposed method on three violent interaction datasets: hockey fights, movies, and violent interaction. We also verify the proposed method on our own elevator surveillance video dataset, where its performance is satisfactory. |
url |
https://doi.org/10.1177/0020294020902788 |