Efficient visual attention based framework for extracting key frames from videos

Article ID	Journal	Published Year	Pages	File Type
537679	Signal Processing: Image Communication	2013	11 Pages	PDF

Abstract

The huge amount of video data on the internet calls for efficient video browsing and retrieval strategies. One viable solution is to summarize videos as sets of key frames. Visual attention modeling has recently been applied to video summarization: visually salient frames are extracted as key frames on the basis of models of human attention. These schemes have proved effective, but their high computational cost limits their applicability in practical scenarios. This paper therefore proposes an efficient key frame extraction method based on a visual attention model. The computational cost is reduced by detecting dynamic visual saliency with temporal gradients instead of traditional optical flow methods. For static visual saliency, an effective method employing the discrete cosine transform is used. The static and dynamic visual attention measures are combined using a non-linear weighted fusion method. Experimental results indicate that the proposed method is not only efficient but also yields high-quality video summaries.

► A visual attention based framework for video summarization is proposed.
► Static and dynamic visual attention cues are computed and combined.
► The framework is efficient and suitable for real-time applications.
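
The pipeline outlined in the abstract (a temporal-gradient dynamic cue, a static cue, non-linear fusion, key frame selection) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the dynamic cue is taken as the mean absolute frame difference, the static saliency scores are supplied by the caller (the DCT-based static model is not reproduced), the fusion rule in `fuse` is a hypothetical self-weighted mean, and key frames are simply the `k` highest-scoring frames. All function names are hypothetical.

```python
from typing import List, Sequence

# A grayscale frame: rows of pixel intensities in [0, 255].
Frame = List[List[float]]


def temporal_gradient_score(prev: Frame, curr: Frame) -> float:
    """Mean absolute intensity change between two frames, scaled to [0, 1].

    Assumed stand-in for a temporal-gradient dynamic saliency measure;
    it needs one pass over the pixels and no optical flow estimation.
    """
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / (255.0 * count) if count else 0.0


def fuse(static: float, dynamic: float) -> float:
    """Hypothetical non-linear fusion: each cue is weighted by its own value.

    Equivalent to (s*s + d*d) / (s + d), so the stronger cue dominates.
    This is NOT the paper's fusion rule, only an illustrative choice.
    """
    denom = static + dynamic
    if denom == 0.0:
        return 0.0
    return (static * static + dynamic * dynamic) / denom


def attention_curve(frames: Sequence[Frame],
                    static_scores: Sequence[float]) -> List[float]:
    """Fused attention value per frame; the first frame has no dynamic cue."""
    curve = []
    for i, s in enumerate(static_scores):
        d = temporal_gradient_score(frames[i - 1], frames[i]) if i > 0 else 0.0
        curve.append(fuse(s, d))
    return curve


def select_key_frames(curve: Sequence[float], k: int) -> List[int]:
    """Indices of the k frames with the highest attention, in temporal order."""
    ranked = sorted(range(len(curve)), key=lambda i: curve[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    dark = [[0.0, 0.0], [0.0, 0.0]]
    bright = [[255.0, 255.0], [255.0, 255.0]]
    frames = [dark, bright, bright, dark]
    static = [0.2, 0.2, 0.2, 0.2]  # placeholder static saliency scores
    curve = attention_curve(frames, static)
    print(select_key_frames(curve, k=2))
```

In this toy input the two frames at which the scene changes abruptly receive the highest fused attention. A real system would work on full-resolution saliency maps rather than one score per frame, and would likely use a more careful selection rule than a fixed top-`k`.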

Keywords

Key frame extraction; Visual saliency; Video summarization; Visual attention model