Article ID | Journal | Published Year | Pages |
---|---|---|---|
6864599 | Neurocomputing | 2018 | 10 |
Abstract
With the rapid development of deep convolutional neural networks (CNNs), a growing number of high-accuracy CNN architectures have been proposed for multimedia and vision applications. Among them, the Bilinear CNN (B-CNN) computes the outer product of the convolutional-layer outputs of two CNN streams, and has attracted considerable attention for its effectiveness in fine-grained categorization. However, B-CNN fails to use the information inherent in different convolutional layers, and its computational and storage costs are high because two CNN models must be evaluated. In this paper, we propose a family of Hyperlayer Bilinear Pooling (HLBP) methods to overcome these limitations of B-CNN. By evaluating a single CNN model only once, our HLBP methods can effectively exploit the second-order statistics of features from multiple layers. Besides fine-grained categorization, we also apply the proposed HLBP methods to the content-based image retrieval task. Extensive experiments on seven widely used benchmarks demonstrate that our HLBP methods outperform their counterparts (e.g., B-CNN) and achieve competitive or better performance than state-of-the-art methods on both fine-grained categorization and content-based image retrieval.
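To make the outer-product idea concrete, the following is a minimal sketch of generic bilinear pooling in plain Python, not the paper's HLBP formulation, which this abstract does not specify. It assumes the two feature sets come from two layers of the same network and cover the same spatial locations; the function name `bilinear_pool` and the toy values are hypothetical.

```python
def bilinear_pool(feats_a, feats_b):
    """Average the outer products of two feature sets over spatial locations.

    feats_a: list of N descriptors, each with Ca channels
    feats_b: list of N descriptors, each with Cb channels (same N locations)
    Returns a flat list of length Ca * Cb (second-order statistics).
    """
    if len(feats_a) != len(feats_b):
        raise ValueError("both feature sets must cover the same locations")
    n = len(feats_a)
    ca, cb = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (ca * cb)
    for xa, xb in zip(feats_a, feats_b):
        # Accumulate the outer product xa * xb^T for this location.
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += xa[i] * xb[j] / n
    return pooled


# Toy activations at 2 spatial locations from two layers of one network
# (hypothetical values; real layers would have hundreds of channels).
layer_a = [[1.0, 2.0], [3.0, 4.0]]            # 2 locations x 2 channels
layer_b = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # 2 locations x 3 channels

descriptor = bilinear_pool(layer_a, layer_b)
print(descriptor)
```

Because both inputs are taken from a single forward pass in this sketch, only one network is evaluated, which is the cost advantage the abstract attributes to HLBP over the two-stream B-CNN. The descriptor length grows as the product of the two channel counts, so pairing wide layers yields very large descriptors.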
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Qiule Sun, Qilong Wang, Jianxin Zhang, Peihua Li