Article ID Journal Published Year Pages File Type
6864627 Neurocomputing 2018 25 Pages PDF
Abstract
This paper investigates how to integrate the complementary information from RGB and thermal (RGB-T) sources for object tracking. We propose a novel Convolutional Neural Network (ConvNet) architecture, including a two-stream ConvNet and a FusionNet, to achieve adaptive fusion of different source data for robust RGB-T tracking. Both RGB and thermal streams extract generic semantic information of the target object. In particular, the thermal stream is pre-trained on the ImageNet dataset to encode rich semantic information, and then fine-tuned using thermal images to capture the specific properties of thermal information. For adaptive fusion of different modalities while avoiding redundant noises, the FusionNet is employed to select most discriminative feature maps from the outputs of the two-stream ConvNet, and updated online to adapt to appearance variations of the target object. Finally, the object locations are efficiently predicted by applying the multi-channel correlation filter on the fused feature maps. Extensive experiments on the recently public benchmark GTOT verify the effectiveness of the proposed approach against other state-of-the-art RGB-T trackers.
Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, , , , ,