Spatiotemporal Representation Learning for Video Anomaly Detection
Video-based detection of anomalous human behavior is widely studied in fields such as security, medical care, education, and energy. However, several problems remain open: large, complicated models are difficult to train, the accuracy of anomalous...
Main Authors: | Zhaoyan Li, Yaoshun Li, Zhisheng Gao |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
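Subjects: | Spatiotemporal representation learning, anomaly detection, 3D convolutional neural network, mixed Gaussian model |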
Online Access: | https://ieeexplore.ieee.org/document/8976183/ |
id |
doaj-4b73403dccb14cc59c1a3be50fac3807 |
---|---|
record_format |
Article |
spelling |
doaj-4b73403dccb14cc59c1a3be50fac3807; 2021-03-30T02:22:23Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8; pp. 25531-25542; DOI 10.1109/ACCESS.2020.2970497; document 8976183; Spatiotemporal Representation Learning for Video Anomaly Detection; Zhaoyan Li (0); Yaoshun Li (1); Zhisheng Gao (2); https://orcid.org/0000-0002-0470-8861; School of Computer and Software Engineering, Xihua University, Chengdu, China (listed for each of the three authors); https://ieeexplore.ieee.org/document/8976183/; Spatiotemporal representation learning, anomaly detection, 3D convolutional neural network, mixed Gaussian model |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Zhaoyan Li, Yaoshun Li, Zhisheng Gao |
title |
Spatiotemporal Representation Learning for Video Anomaly Detection |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
Video-based detection of anomalous human behavior is widely studied in fields such as security, medical care, education, and energy. However, several problems remain open: large, complicated models are difficult to train, detection accuracy is not high enough, and processing is not fast enough. This paper proposes a spatiotemporal representation learning model. First, the spatiotemporal features of the video are extracted by a multi-scale 3D convolutional neural network. Then the scene background is modeled by a high-dimensional mixed Gaussian model, which is used for anomaly detection. Finally, anomalous behavior is localized in the video by computing the position of the last output feature, that is, the position of its receptive field. The proposed model does not require specific training, and the method offers high versatility, fast computation, and high detection accuracy. We validated the proposed algorithm on two representative surveillance datasets, Subway and UCSD Ped2. The results show that the algorithm runs at 18 FPS on common computing resources, which meets real-time requirements. Moreover, compared with similar methods, the proposed method achieves competitive results in both frame-level and pixel-level accuracy. |
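The localization step in the description (mapping the last output feature back to its receptive field) can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the network's layer settings, so the `(kernel, stride, padding)` triples, the function names `receptive_field_1d` and `locate_anomaly`, and the input shape below are all hypothetical. The sketch only shows the general back-projection arithmetic for a stack of convolution and pooling layers, applied independently along the time, height, and width axes.

```python
from typing import List, Sequence, Tuple

# One layer along one axis: (kernel size, stride, padding). Hypothetical values;
# the actual architecture is not given in this record.
Layer = Tuple[int, int, int]


def receptive_field_1d(index: int, layers: Sequence[Layer]) -> Tuple[int, int]:
    """Return the inclusive input range [lo, hi] seen by output `index` on one axis.

    The range may extend below 0 (or past the input size) where padding
    contributes to the output.
    """
    lo = hi = index
    # Walk from the last layer back to the input, widening the interval.
    for kernel, stride, padding in reversed(layers):
        lo = lo * stride - padding
        hi = hi * stride - padding + kernel - 1
    return lo, hi


def locate_anomaly(
    position: Sequence[int],
    layers_per_axis: Sequence[Sequence[Layer]],
    input_shape: Sequence[int],
) -> List[Tuple[int, int]]:
    """Map an output feature position (t, y, x) to clipped input ranges per axis."""
    box = []
    for index, layers, size in zip(position, layers_per_axis, input_shape):
        lo, hi = receptive_field_1d(index, layers)
        # Clip to the valid input extent so the box stays inside the video.
        box.append((max(lo, 0), min(hi, size - 1)))
    return box


# Hypothetical stack: 3-wide conv (pad 1), 2-wide pool (stride 2), 3-wide conv (pad 1).
example_layers: List[Layer] = [(3, 1, 1), (2, 2, 0), (3, 1, 1)]

if __name__ == "__main__":
    # A feature flagged as anomalous at output position (t=0, y=5, x=5)
    # in a clip assumed to have 16 frames of 64 x 64 pixels.
    print(locate_anomaly((0, 5, 5), [example_layers] * 3, (16, 64, 64)))
```

With these assumed layers, each output feature covers 8 input positions per axis, so a flagged feature maps to a box of at most 8 frames by 8 x 8 pixels, smaller where the box is clipped at the clip boundary.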
topic |
Spatiotemporal representation learning, anomaly detection, 3D convolutional neural network, mixed Gaussian model |
url |
https://ieeexplore.ieee.org/document/8976183/ |