Universal Foreground Segmentation Based on Deep Feature Fusion Network for Multi-Scene Videos
Foreground/background (fg/bg) classification is an important first step for several video analysis tasks such as people counting, activity recognition, and anomaly detection. As is the case for several other Computer Vision problems, the advent of deep Convolutional Neural Network (CNN) methods has led to major improvements in this field. However, despite their success, CNN-based methods have difficulty coping with multi-scene videos, where the scene changes multiple times along the time sequence. In this paper, we propose a deep-feature-fusion-network-based foreground segmentation method (DFFnetSeg) that is robust both to scene changes and to unseen scenes, in comparison with competitive state-of-the-art methods. At the heart of DFFnetSeg lies a fusion network that takes as input deep features extracted from a current frame, a previous frame, and a reference frame, and produces as output a segmentation mask into background and foreground objects. We show the advantages of using a fusion network and the three-frame group in dealing with the unseen-scene and bootstrap challenges. In addition, we show that a simple reference-frame updating strategy makes DFFnetSeg robust to sudden scene changes inside video sequences, and we propose a motion-map-based post-processing method that further reduces false positives. Experimental results on a test dataset generated from CDnet2014 and LASIESTA demonstrate the advantages of the DFFnetSeg method.
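The record carries no code; the following is a minimal PyTorch-style sketch of the three ideas the abstract describes (the three-frame fusion network, the reference-frame update, and the motion-map post-processing). Every module name, layer width, and threshold here is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch of the three-frame fusion idea described in the abstract.
# All module names, channel widths, and thresholds below are illustrative
# assumptions; this is not the authors' released implementation.
import torch
import torch.nn as nn


class DeepFeatureFusionSeg(nn.Module):
    """Fuses deep features from a current, previous, and reference frame
    into a per-pixel foreground probability map."""

    def __init__(self, feat_channels: int = 64):
        super().__init__()
        # Shared encoder applied to each frame independently (assumed; the
        # abstract does not specify the actual backbone).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Fusion head: concatenated three-frame features -> fg/bg logits.
        self.fusion = nn.Sequential(
            nn.Conv2d(3 * feat_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, 1, kernel_size=1),
        )

    def forward(self, current, previous, reference):
        # Extract deep features from each of the three frames.
        feats = [self.encoder(f) for f in (current, previous, reference)]
        fused = torch.cat(feats, dim=1)
        return torch.sigmoid(self.fusion(fused))  # foreground probabilities


def maybe_update_reference(reference, current, change_score, threshold=0.5):
    """Simple reference-frame update: adopt the current frame as the new
    reference when a scene-change score exceeds a threshold. The scoring
    rule and threshold are assumptions, not the paper's exact strategy."""
    return current.clone() if change_score > threshold else reference


def motion_map_postprocess(mask, current, previous, motion_threshold=0.05):
    """Motion-map post-processing: suppress foreground predictions in
    regions with (near-)zero inter-frame motion, reducing false positives.
    Absolute frame differencing is an assumed motion cue."""
    motion = (current - previous).abs().mean(dim=1, keepdim=True)
    return mask * (motion > motion_threshold).float()
```

A caller would run `maybe_update_reference` on each incoming frame before the forward pass and apply `motion_map_postprocess` to the predicted mask.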
Main Authors: | Ye Tao, Zhihao Ling, Ioannis Patras |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Convolutional neural network; foreground segmentation; multi-scene videos aware |
Online Access: | https://ieeexplore.ieee.org/document/8888275/ |
DOI: | 10.1109/ACCESS.2019.2950639 |
Volume / Pages: | IEEE Access, vol. 7, pp. 158326-158337 |
ISSN: | 2169-3536 |
Record ID: | doaj-ed83bdb01b82448ea70838e7bfb9f543 (last updated 2021-03-30T00:19:15Z) |
ORCID: | Ye Tao: https://orcid.org/0000-0002-3954-828X |
Affiliations: | Ye Tao and Zhihao Ling: Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology, Shanghai, China; Ioannis Patras: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K. |