Article ID: 6938236
Journal: Journal of Visual Communication and Image Representation
Published Year: 2018
Pages: 15 Pages
File Type: PDF
Abstract
Tracking-by-detection (TBD) is widely used in visual object tracking. However, many TBD-based methods ignore the strong motion correlation between the current and previous frames. In this work, a motion-guided convolutional neural network (MGNet) for online object tracking is proposed. The MGNet tracker is built upon the multi-domain convolutional neural network (MDNet) with two innovations: (1) a motion-guided candidate selection (MCS) scheme based on a dynamic prediction model, which generates candidate regions accurately and efficiently, and (2) a unified, end-to-end trained network that takes the spatial RGB frame and the temporal optical flow as joint inputs, rather than processing them in two separate branches. We compare MGNet with MDNet and several state-of-the-art online object trackers on the OTB and VOT benchmark datasets, and the extensive performance evaluation demonstrates that MGNet captures the temporal correlation between consecutive frames more effectively.
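The following is a minimal sketch of the two ideas summarized above, not the authors' implementation: the layer sizes, the 5-channel early fusion of RGB and optical flow, and the constant-velocity Gaussian candidate sampler are all illustrative assumptions standing in for the paper's actual architecture and dynamic prediction model.

```python
# Hedged sketch only: hypothetical layer sizes and sampling parameters,
# not the MGNet architecture described in the paper.
import numpy as np
import torch
import torch.nn as nn

class FusedTracker(nn.Module):
    """Single-branch CNN taking RGB (3 ch) + optical flow (2 ch) stacked
    into one 5-channel input, instead of two separate branches."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(5, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.score = nn.Linear(64, 2)  # target vs. background score

    def forward(self, rgb, flow):
        x = torch.cat([rgb, flow], dim=1)   # early fusion: 3 + 2 channels
        f = self.features(x).flatten(1)
        return self.score(f)

def motion_guided_candidates(prev_box, velocity, n=64, pos_std=8.0):
    """Sample candidate boxes around a motion-predicted center; a
    constant-velocity prediction stands in for the paper's dynamic model."""
    cx, cy, w, h = prev_box
    pred = np.array([cx + velocity[0], cy + velocity[1]])
    centers = np.random.randn(n, 2) * pos_std + pred
    return [(x, y, w, h) for x, y in centers]

if __name__ == "__main__":
    net = FusedTracker()
    rgb = torch.randn(1, 3, 107, 107)    # cropped candidate patch
    flow = torch.randn(1, 2, 107, 107)   # optical flow for the same patch
    print(net(rgb, flow).shape)          # torch.Size([1, 2])
    print(len(motion_guided_candidates((50, 60, 40, 40), (3.0, -1.5))))
```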
Related Topics
Physical Sciences and Engineering; Computer Science; Computer Vision and Pattern Recognition