Article ID | Journal | Published Year | Pages | File Type
---|---|---|---|---
4947537 | Neurocomputing | 2017 | 9 |
Abstract
Identification of characters in TV series and movies is an important and challenging problem. Actor identification results provide important information for many higher-level multimedia analysis tasks, such as semantic indexing and retrieval, interaction analysis, and video summarization. Compared with previous work on actor identification, which mainly focuses on static features based on face identification and costume detection, this paper exploits the abundant dynamic information contained in videos to improve performance when actors' appearances are hard to detect or change greatly over time. We propose to mine representative actions of each actor and show the remarkable power of such actions for actor identification. Videos are first divided into shots and represented by Bag of Words (BoW) using spatio-temporal features. Integrating prototype theory with representativeness, we propose two novel methods to rank the shots and obtain the representative actions. Our methods for actor identification combine representative actions with actors' appearance. We validate the performance of the proposed methods on episodes of the TV series “The Big Bang Theory”. The experimental results show that the representative actions are consistent with human judgements and can greatly improve matching performance as a complement to existing handcrafted static features for actor identification.
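To make the pipeline described in the abstract concrete, the following is a minimal sketch of prototype-based shot ranking, assuming each shot has already been encoded as a BoW histogram over spatio-temporal features. The function name, the use of a mean-vector prototype, and cosine similarity as the representativeness score are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch: rank an actor's shots by closeness to a prototype BoW vector.
# Assumptions (not from the paper): mean-vector prototype, cosine similarity.
import numpy as np

def rank_representative_shots(bow_histograms: np.ndarray) -> np.ndarray:
    """Return shot indices sorted from most to least representative.

    bow_histograms: (n_shots, vocab_size) array of L1-normalized BoW vectors,
    one row per shot of a single actor.
    """
    prototype = bow_histograms.mean(axis=0)          # prototype = mean BoW vector
    eps = 1e-12                                      # guard against zero norms
    sims = (bow_histograms @ prototype) / (
        np.linalg.norm(bow_histograms, axis=1) * np.linalg.norm(prototype) + eps
    )                                                # cosine similarity to prototype
    return np.argsort(-sims)                         # most representative first

# Toy usage: 5 shots over a 4-word spatio-temporal vocabulary
shots = np.random.default_rng(0).random((5, 4))
shots /= shots.sum(axis=1, keepdims=True)            # L1-normalize histograms
print(rank_representative_shots(shots))
```

The top-ranked shots would then serve as the actor's representative actions and be combined with appearance cues for identification, as the abstract outlines.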
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Wenlong Xie, Hongxun Yao, Xiaoshuai Sun, Sicheng Zhao, Wei Yu, Shengping Zhang