Summarization of films and documentaries based on subtitles and scripts

Article ID	Journal	Published Year	Pages	File Type
536178	Pattern Recognition Letters	2016	6 Pages	PDF

Abstract

• We study the behavior of automatic summarization for films and documentaries.
• Well-known extractive summarization algorithms are ranked for this task.
• Assessment of strategies for effective extractive summarization in these domains.
• Quantitative results are presented for relevant experiments.
• Qualitative assessment is also provided (concerning the best approaches).

We assess the performance of generic text summarization algorithms applied to films and documentaries, using extracts from news articles produced by reference models of extractive summarization. We use three datasets: (i) news articles, (ii) film scripts and subtitles, and (iii) documentary subtitles. We use standard ROUGE metrics to compare the generated summaries against news abstracts, plot summaries, and synopses. We show that the best-performing algorithms are LSA for news articles and documentaries, and LexRank and Support Sets for films. Despite the different nature of films and documentaries, their relative behavior is in accordance with that obtained for news articles.
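To make the evaluation step concrete, the following is a minimal sketch of an n-gram overlap score in the spirit of ROUGE-N. It is not the authors' evaluation setup: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is plain lowercased whitespace splitting, and no stemming or stopword removal is applied, so its scores will generally differ from those of the standard ROUGE toolkit.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N between a generated summary and one reference.

    Assumption: whitespace tokenization and lowercasing only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    plot_summary = "the cat lay on the mat"
    print(rouge_n(generated, plot_summary, n=1))
    print(rouge_n(generated, plot_summary, n=2))
```

In a setting like the one described above, `candidate` would be an extract produced by a summarizer and `reference` a news abstract, plot summary, or synopsis; averaging the scores over a dataset gives one number per algorithm for ranking.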

Keywords

Automatic text summarization