Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
527286 | 869310 | 2016 | 14-page PDF | Free download |
• We propose a novel framework for Relevance Feedback based on the Fisher Kernel.
• The Fisher Kernel representation makes it possible to capture temporal variation by using frame-based features.
• We experiment on a wide variety of scenarios and public datasets (genre classification: Blip10000; action recognition: UCF50 and UCF101; daily activity recognition: ADL) and show that the proposed approach outperforms other state-of-the-art approaches.
• We demonstrate the generality of our approach: the framework does not depend on a particular type of content descriptor (experiments were run with text, visual and audio features).
This paper proposes a novel framework for Relevance Feedback based on the Fisher Kernel (FK). Specifically, we train a Gaussian Mixture Model (GMM) on the top retrieval results (without supervision) and use it to create an FK representation, which is therefore specialized in modelling the most relevant examples. We use the FK representation to explicitly capture temporal variation in video via frame-based features taken at different time intervals. While the GMM is being trained, a user selects from the top examples the ones they are looking for. This feedback is used to train a Support Vector Machine on the FK representation, which is then applied to re-rank the top retrieved results. We show that our approach outperforms other state-of-the-art relevance feedback methods. Experiments were carried out on the Blip10000, UCF50, UCF101 and ADL standard datasets using a broad range of multi-modal content descriptors (visual, audio, and text).
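The pipeline described in the abstract (unsupervised GMM on the top results, Fisher Kernel encoding, SVM trained on user feedback, re-ranking) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names `fisher_vector` and `rerank`, the diagonal-covariance GMM, the power and L2 normalization, the linear kernel and all parameter values are choices made here for the example. Each video is assumed to be a NumPy array of frame descriptors with shape `(n_frames, dim)`, and the frames are treated as an unordered set, so the paper's handling of different time intervals is not reproduced. The GMM is also fitted before the feedback is used, rather than while the user is labelling.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC


def fisher_vector(frames, gmm):
    """Encode one video (n_frames x dim) as a Fisher vector.

    Uses the gradients with respect to the GMM means and variances,
    followed by power and L2 normalization (an assumption of this sketch).
    """
    frames = np.atleast_2d(frames)
    n = frames.shape[0]
    q = gmm.predict_proba(frames)          # (n, K) soft assignments
    mu = gmm.means_                        # (K, dim)
    sigma = np.sqrt(gmm.covariances_)      # (K, dim) for diagonal covariances
    w = gmm.weights_                       # (K,)

    # Normalized deviation of every frame from every component: (n, K, dim)
    diff = (frames[:, None, :] - mu[None, :, :]) / sigma[None, :, :]
    g_mu = (q[:, :, None] * diff).sum(axis=0) / (n * np.sqrt(w)[:, None])
    g_var = (q[:, :, None] * (diff ** 2 - 1)).sum(axis=0) / (
        n * np.sqrt(2 * w)[:, None]
    )

    fv = np.concatenate([g_mu.ravel(), g_var.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))  # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv


def rerank(top_videos, relevant_idx, nonrelevant_idx, n_components=4, seed=0):
    """Re-rank the top retrieved videos from user feedback.

    top_videos: list of (n_frames, dim) arrays for the top results.
    relevant_idx / nonrelevant_idx: indices the user marked (hypothetical API).
    Returns indices into top_videos, most relevant first.
    """
    # 1. Fit the GMM on all frames of the top results, without labels.
    gmm = GaussianMixture(
        n_components=n_components, covariance_type="diag", random_state=seed
    ).fit(np.vstack(top_videos))

    # 2. Encode every top result with the Fisher Kernel representation.
    fvs = np.array([fisher_vector(v, gmm) for v in top_videos])

    # 3. Train an SVM on the examples the user labelled.
    labelled = list(relevant_idx) + list(nonrelevant_idx)
    y = [1] * len(relevant_idx) + [0] * len(nonrelevant_idx)
    svm = SVC(kernel="linear", C=1.0).fit(fvs[labelled], y)

    # 4. Score all top results and sort by decreasing relevance score.
    scores = svm.decision_function(fvs)
    return np.argsort(-scores)
```

A call such as `rerank(videos, relevant_idx=[0, 1], nonrelevant_idx=[5, 6])` returns a new ordering of the top results. Fitting the GMM only on the top results is what makes the representation specific to the current query, as the abstract describes.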
Journal: Computer Vision and Image Understanding - Volume 143, February 2016, Pages 38–51