Article code: 525704
Journal code: 869014
Publication year: 2013
English article: 14-page PDF
Full-text version: free download
English title of the ISI article
Comparison of mid-level feature coding approaches and pooling strategies in visual concept detection
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract

Bag-of-Words lies at the heart of modern object category recognition systems. After descriptors are extracted from images, they are expressed as vectors representing visual word content, referred to as mid-level features. In this paper, we review a number of techniques for generating mid-level features, including two variants of Soft Assignment, Locality-constrained Linear Coding, and Sparse Coding. We also isolate the underlying properties that affect their performance. Moreover, we investigate various pooling methods that aggregate mid-level features into vectors representing images. Average pooling, Max-pooling, and a family of likelihood-inspired pooling strategies are scrutinised. We demonstrate how coding schemes and pooling methods interact with each other. We generalise the investigated pooling methods to account for the descriptor interdependence and introduce an intuitive concept of improved pooling. We also propose an improvement to the coding step that increases its speed. Lastly, state-of-the-art performance in classification is demonstrated on the Caltech101, Flower17, and ImageCLEF11 datasets.
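
The coding and pooling steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common Gaussian-kernel form of Soft Assignment, and the function names, the sigma parameter, and the toy codebook and descriptors are all hypothetical. Locality-constrained Linear Coding, Sparse Coding, and the likelihood-inspired pooling family are not covered.

```python
import math

def soft_assign(descriptor, codebook, sigma=1.0):
    """Code one descriptor as normalised Gaussian memberships over visual words.

    Assumed form of Soft Assignment; sigma is a hypothetical smoothing parameter.
    """
    # Squared Euclidean distance from the descriptor to every visual word.
    d2 = [sum((x - c) ** 2 for x, c in zip(descriptor, word)) for word in codebook]
    # Shift by the smallest distance before exponentiating; the shift cancels
    # in the normalisation and only guards against underflow.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]

def average_pool(codes):
    """Average pooling: mean response of each visual word over all descriptors."""
    n = len(codes)
    return [sum(column) / n for column in zip(*codes)]

def max_pool(codes):
    """Max-pooling: strongest response of each visual word over all descriptors."""
    return [max(column) for column in zip(*codes)]

# Toy data (hypothetical): three 2-D visual words and four local descriptors.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 0.1], [0.0, 0.8], [0.1, 0.1]]

# Mid-level features: one code vector per descriptor.
codes = [soft_assign(d, codebook, sigma=0.5) for d in descriptors]

# Image-level vectors: one entry per visual word.
image_avg = average_pool(codes)
image_max = max_pool(codes)
```

Both pooled vectors have one entry per visual word. Average pooling keeps the mean membership, while Max-pooling keeps only the strongest response, so it never falls below the average for any word.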


► We compare various mid-level feature coding methods for Bags-of-Words.
► Soft Assignment and Local Coordinate Coding families are scrutinised.
► Likelihood-inspired pooling methods outperform the baseline Max-pooling.
► Sparse Coding with the proposed pooling outperforms other methods.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Vision and Image Understanding - Volume 117, Issue 5, May 2013, Pages 479–492