Foreground segmentation using convolutional neural networks for multiscale feature encoding

Article ID	Journal	Published Year	Pages	File Type
6940173	Pattern Recognition Letters	2018	10 Pages	PDF

Abstract

Several methods have been proposed to segment moving objects accurately in different scenes. However, many of them cannot handle difficult scenarios such as illumination changes, background or camera motion, camouflage effects, and shadows. To address these issues, we propose two robust encoder-decoder neural networks that generate multi-scale feature encodings in different ways and can be trained end-to-end using only a few training samples. Using the same encoder-decoder configuration, the first model employs a triplet of encoders that take the input at three scales to embed an image in a multi-scale feature space; in the second model, a Feature Pooling Module (FPM) is plugged on top of a single-input encoder to extract multi-scale features in the middle layers. Both models use a transposed convolutional network in the decoder to learn a mapping from feature space to image space. To evaluate our models, we entered the Change Detection 2014 Challenge (changedetection.net), where our models, FgSegNet_M and FgSegNet_S, outperformed all existing state-of-the-art methods with average F-Measures of 0.9770 and 0.9804, respectively. We also evaluate our models on the SBI2015 and UCSD Background Subtraction datasets. Our source code is publicly available at https://github.com/lim-anggun/FgSegNet.
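
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F-Measure can be computed for a single predicted foreground mask. It uses the standard definition (the harmonic mean of precision and recall) and is not the benchmark's evaluation script: the function name `f_measure` and the toy masks are illustrative assumptions, and benchmark-specific details such as excluded regions or averaging across video categories are not reproduced here.

```python
def f_measure(pred, truth):
    """F-Measure of a binary foreground mask against a ground-truth mask.

    pred, truth: equally sized 2-D lists holding 0 (background) or 1 (foreground).
    Hypothetical helper for illustration; not taken from the FgSegNet code.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # foreground pixel correctly detected
            elif p and not t:
                fp += 1          # background pixel labeled as foreground
            elif t and not p:
                fn += 1          # foreground pixel that was missed
    if tp == 0:
        return 0.0               # no true positives: define the score as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks (made-up data, only to exercise the function).
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
score = f_measure(predicted, ground_truth)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F-Measure is also 2/3.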

Keywords

Background subtraction; Foreground segmentation; Convolutional neural networks; Pixel classification; Video surveillance; Deep learning