Article ID: 6938317
Journal: Journal of Visual Communication and Image Representation
Published Year: 2018
Pages: 13
File Type: PDF
Abstract
The task of spatiotemporal saliency detection is to distinguish salient objects from the background across all frames of a video. Although many spatiotemporal models have been designed from various perspectives, handling unconstrained videos with complicated motions and complex scenes remains very challenging. Therefore, in this paper we propose a novel spatiotemporal saliency model to estimate salient objects in unconstrained videos. Specifically, a bagging-based saliency prediction model, i.e. an ensemble regressor combining random forest regressors learned from undersampled training sets, is first used to perform saliency prediction for each current frame. Then, both forward and backward propagation within a local temporal window are deployed on each current frame to complement the predicted saliency map and yield the temporal saliency map, in which the backward propagation is constructed from the temporary saliency estimates of the following frames. Finally, by building appearance-based and motion-based graphs in parallel, spatial propagation is applied over the temporal saliency map to generate the final spatiotemporal saliency map. In experiments on two challenging datasets, the proposed model consistently outperforms state-of-the-art models at popping out salient objects in unconstrained videos.
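The bagging-based prediction step described in the abstract can be sketched as follows: each regressor in the ensemble is a random forest fitted on an undersampled subset of the training data, and per-frame saliency scores are obtained by averaging the forests' predictions. This is a minimal illustrative sketch, not the paper's implementation; the feature dimensionality, number of bags, sampling fraction, and forest size below are assumptions.

```python
# Minimal sketch of a bagging-based saliency predictor: an ensemble of
# random forest regressors, each trained on an undersampled subset of the
# training set, whose predictions are averaged into one saliency score.
# All hyperparameters here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_bagged_regressors(X, y, n_bags=5, sample_frac=0.5, seed=0):
    """Train one random forest per undersampled bag of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_bags):
        # Undersample: draw a random subset of the training examples.
        idx = rng.choice(n, size=int(sample_frac * n), replace=False)
        rf = RandomForestRegressor(n_estimators=20, random_state=0)
        rf.fit(X[idx], y[idx])
        models.append(rf)
    return models

def predict_saliency(models, X):
    """Average per-bag predictions into one saliency score per sample."""
    return np.mean([m.predict(X) for m in models], axis=0)

# Toy usage: synthetic per-region features -> saliency scores in [0, 1].
rng = np.random.default_rng(1)
X = rng.random((200, 8))   # hypothetical region features
y = rng.random(200)        # hypothetical ground-truth saliency in [0, 1]
models = train_bagged_regressors(X, y)
scores = predict_saliency(models, X)
```

Averaging over forests trained on different undersampled bags reduces the variance of any single regressor; since each tree predicts an average of training targets, the ensemble's scores stay within the range of the training labels.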
Related Topics
Physical Sciences and Engineering › Computer Science › Computer Vision and Pattern Recognition
Authors