کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
527756 869355 2013 16 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Using objective ground-truth labels created by multiple annotators for improved video classification: A comparative study
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
پیش نمایش صفحه اول مقاله
Using objective ground-truth labels created by multiple annotators for improved video classification: A comparative study
چکیده انگلیسی


• We present a strategy to create objectively labeled ground-truth set of videos.
• Such a strategy mitigates the subjective biases of the annotation process.
• Objective labels show superior consistency than subjective labels.
• A classifier is trained to predict objective labels for 51 K unlabeled videos.

We address the problem of predicting category labels for unlabeled videos in a large video dataset by using a ground-truth set of objectively labeled videos that we have created. Large video databases like YouTube require that a user uploading a new video assign to it a category label from a prescribed set of labels. Such category labeling is likely to be corrupted by the subjective biases of the uploader. Despite their noisy nature, these subjective labels are frequently used as gold standard in algorithms for multimedia classification and retrieval. Our goal in this paper is NOT to propose yet another algorithm that predicts labels for unseen videos based on the subjective ground-truth. On the other hand, our goal is to demonstrate that the video classification performance can be improved if instead of using subjective labels, we first create an objectively labeled ground-truth set of videos and then train a classifier based on such a ground-truth so as to predict objective labels for the set of unlabeled videos.With regard to how we generate the objectively-labeled ground-truth dataset, we base it on the notion that when a video is labeled by a panel of diverse individuals, the majority opinion rendered by the panel may be taken to be the objective opinion. In this manner, using judgments provided by multiple human annotators, we have collected objective labels for a ground-truth dataset consisting of randomly-selected 1000 videos from the TinyVideos database that contains roughly 52,000 videos from YouTube (courtesy of Karpenko and Aarabi [1]).Through a fourfold cross-validation experiment on the ground-truth set, we demonstrate that the objective labels have a superior consistency compared to the subjective labels when used for video classification. We show that this claim is valid for several different kinds of feature sets that one can use to compare videos and with two different types of classifiers that one can use for label prediction.Subsequently, we use the ground-truth dataset of 1000 videos to predict the objective category labels of the remaining 51,000 videos. We compare the objective labels thus determined with the subjective labels provided by the video uploaders and qualitatively argue for the more informative nature of the objective labels.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Vision and Image Understanding - Volume 117, Issue 10, October 2013, Pages 1384–1399
نویسندگان
, , , ,