Location-aware fine-grained vehicle type recognition using multi-task deep networks

Article ID	Journal	Published Year	Pages	File Type
4947498	Neurocomputing	2017	33 Pages	PDF

Abstract

Distinct from conventional approaches to generic vehicle type recognition, this paper proposes an approach to fine-grained vehicle type recognition in which convolutional neural networks (CNNs) jointly leverage multiple cues. Fine-grained vehicle recognition is inherently difficult, with challenges including (i) how to incorporate multiple cues through joint optimization instead of handling them separately, (ii) how to learn data-driven representations for vehicles instead of relying on hand-crafted features, and (iii) how to perceive and localize the vehicle region instead of using the whole image. Our multi-task CNNs localize vehicles in the first stage and recognize subclasses in the second stage, allowing us to handle samples with cluttered backgrounds and those in which the vehicle does not fill most of the image. Through collaborative feature learning from multiple tasks in each stage, our approach can handle subtle inter-class variations at the subordinate level. We also develop new vehicle-oriented data augmentation strategies for CNN training. To advance research on vehicle-centered tasks, we release a new vehicle dataset that provides both semantic labels and bounding boxes. This dataset can be used for vehicle localization, recognition, viewpoint estimation, and related tasks. Extensive experiments on two benchmark vehicle datasets demonstrate that our approach outperforms state-of-the-art algorithms.
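
The two ideas in the abstract, joint optimization over several tasks and cropping to the localized vehicle before recognition, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss functions, the task set, or the weighting, so the choice of softmax cross-entropy plus smooth L1, the `box_weight` parameter, and all function names below are assumptions made purely for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one sample, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def smooth_l1(pred_box, true_box):
    """Smooth L1 distance summed over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred_box, true_box):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def joint_loss(logits, label, pred_box, true_box, box_weight=1.0):
    """Assumed multi-task objective: one weighted sum over both tasks,
    so a shared network would be optimized jointly rather than per task."""
    cls = softmax_cross_entropy(logits, label)
    loc = smooth_l1(pred_box, true_box)
    return cls + box_weight * loc


def crop_to_box(image, box):
    """Keep only the localized region before second-stage recognition.
    image is a list of pixel rows; box is (x0, y0, x1, y1), upper bounds exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]
```

In a sketch like this, a first-stage network would supply `pred_box`, and the output of `crop_to_box` would be passed to a second-stage classifier, so that background clutter outside the box does not reach the recognition stage.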

Keywords

Vehicle localization