Article code: 527286
Journal code: 869310
Publication year: 2016
English article: 14 pages, PDF
Full-text version: free download
English title of the ISI article
Fisher Kernel Temporal Variation-based Relevance Feedback for video retrieval
Keywords
Relevance feedback, Fisher Kernel representation, multimodal content description, video retrieval
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
Abstract


• We propose a novel framework for Relevance Feedback based on the Fisher Kernel.
• The Fisher Kernel representation makes it possible to capture temporal variation by using frame-based features.
• We experiment on a wide variety of scenarios and public datasets (genre classification: Blip10000; action recognition: UCF50 and UCF101; daily activity recognition: ADL) and show that the proposed approach outperforms other state-of-the-art approaches.
• We demonstrate the generalization power of our approach: the framework does not depend on a particular type of content descriptor (experiments used text, visual, and audio features).

This paper proposes a novel framework for Relevance Feedback based on the Fisher Kernel (FK). Specifically, we train a Gaussian Mixture Model (GMM) on the top retrieval results (without supervision) and use this to create a FK representation, which is therefore specialized in modelling the most relevant examples. We use the FK representation to explicitly capture temporal variation in video via frame-based features taken at different time intervals. While the GMM is being trained, the user selects, from the top examples, those they are looking for. This feedback is used to train a Support Vector Machine on the FK representation, which is then applied to re-rank the top retrieved results. We show that our approach outperforms other state-of-the-art relevance feedback methods. Experiments were carried out on the Blip10000, UCF50, UCF101 and ADL standard datasets using a broad range of multi-modal content descriptors (visual, audio, and text).
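
The encoding step described in the abstract can be illustrated with a minimal sketch. It assumes a diagonal-covariance GMM whose parameters are already known (in the paper they would come from unsupervised training on the top retrieval results) and computes only the part of the Fisher vector that is the gradient with respect to the GMM means, using the commonly used normalization by the square root of the mixture weight. The function names `posteriors` and `fisher_vector_means` and the toy parameters are hypothetical and are not taken from the paper, whose exact formulation is not given on this page. The SVM re-ranking step is omitted.

```python
import math


def posteriors(x, weights, means, variances):
    """Soft-assign one frame descriptor x to each diagonal-covariance Gaussian."""
    log_p = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_p.append(ll)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_p)
    exp_p = [math.exp(lp - m) for lp in log_p]
    total = sum(exp_p)
    return [e / total for e in exp_p]


def fisher_vector_means(frames, weights, means, variances):
    """Fisher vector restricted to the gradient w.r.t. the GMM means.

    frames: list of per-frame descriptors (each a list of D floats).
    Returns a flat list of length K * D for a GMM with K components.
    """
    T = len(frames)
    K = len(weights)
    D = len(means[0])
    acc = [[0.0] * D for _ in range(K)]
    for x in frames:
        gamma = posteriors(x, weights, means, variances)
        for k in range(K):
            for d in range(D):
                # Posterior-weighted, variance-normalized deviation from the mean.
                acc[k][d] += gamma[k] * (x[d] - means[k][d]) / math.sqrt(variances[k][d])
    return [acc[k][d] / (T * math.sqrt(weights[k])) for k in range(K) for d in range(D)]


# Toy example (hypothetical values): a 2-component GMM over 2-D frame descriptors.
toy_weights = [0.5, 0.5]
toy_means = [[0.0, 0.0], [4.0, 4.0]]
toy_variances = [[1.0, 1.0], [1.0, 1.0]]
toy_frames = [[0.5, -0.2], [3.8, 4.1], [0.1, 0.3]]
video_fv = fisher_vector_means(toy_frames, toy_weights, toy_means, toy_variances)
print(len(video_fv))
```

Under these assumptions, each video yields one fixed-length vector regardless of how many frames it has, which is what allows a standard classifier such as an SVM to be trained on the user's feedback and then used to re-rank the results.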

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Vision and Image Understanding - Volume 143, February 2016, Pages 38–51
Authors
, , , ,