Article code: 529149
Journal code: 869632
Publication year: 2015
English article: 17 pages, PDF
Full-text version: free download
English title of the ISI article
Tensor rank selection for multimedia analysis
Persian title (translated)
Tensor selection for multimedia analysis
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
Abstract


• We propose a novel tensor bag-of-words (tBOW) model that can represent the spatial structure of multimedia data.
• We propose a new tensor-based framework that can effectively reveal the discriminative knowledge along each mode of the tensor.
• The rank of tensor representation can be selected automatically.
• Two types of vector-based algorithms are extended to their tensor counterparts.
• We compare the proposed algorithms with state-of-the-art methods on three multimedia applications.

Tensor representations are widely used in multimedia applications. As a key step of tensor processing, the rank-1 tensor decomposition (i.e., the CANDECOMP/PARAFAC (CP) decomposition) requires an estimate of the tensor rank. The ℓ2,1-norm has been shown to be effective for tensor rank selection. Existing tensor rank selection algorithms force the same columns of all factor matrices to become zero simultaneously. However, the truly sparse columns may differ from one factor matrix to another, so this strategy does not uncover the sparse information of each factor matrix. In this paper, we add a separable ℓ2,1-norm on the multiple factor matrices to obtain genuinely sparse results along the different modes. The different sparse results are then assembled into a joint sparse pattern for tensor rank selection. The added separable regularization term plays a twofold role: it strengthens the regularization of each factor matrix, and it fully exploits the knowledge of multiple factor matrices to facilitate decision making. To exploit the structure information of multimedia data effectively, we propose a tensor bag-of-words (tBOW) model as the direct input to our algorithms. In the experiments, we apply the proposed algorithms to three representative multimedia analysis tasks: image classification, video action recognition, and head pose estimation. Experimental results on three open benchmark datasets show that our algorithms are effective for multimedia analysis.
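
The idea of reading a tensor rank off column-sparse factor matrices can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the abstract does not say how the per-mode sparse results are assembled, so the rule used here (a rank-1 component is kept only if its column is nonzero in every factor matrix) is an assumption, as are the function names `l21_norm`, `active_columns`, and `select_rank`, the tolerance `tol`, and the convention of taking the ℓ2,1-norm over columns.

```python
import numpy as np


def l21_norm(A):
    """Column-wise l2,1-norm: the sum of the Euclidean norms of A's columns.

    Assumption: the norm is taken over columns, so that penalizing it
    drives whole columns (rank-1 components) to zero.
    """
    return float(np.linalg.norm(A, axis=0).sum())


def active_columns(A, tol=1e-6):
    """Boolean mask marking the columns of A whose norm exceeds tol."""
    return np.linalg.norm(A, axis=0) > tol


def select_rank(factors, tol=1e-6):
    """Combine per-mode sparsity patterns into one joint pattern.

    Hypothetical assembly rule: a component a_r o b_r o c_r is zero as
    soon as one of its columns is zero, so it is kept only if it is
    active in every mode. Returns (selected rank, joint mask).
    """
    masks = np.stack([active_columns(A, tol) for A in factors])
    joint = masks.all(axis=0)
    return int(joint.sum()), joint


# Toy factor matrices of a 4 x 5 x 3 tensor with an initial rank of 4.
A = np.ones((4, 4))
A[:, 3] = 0.0            # mode 1: component 3 is switched off
B = np.ones((5, 4))
B[:, 1] = 0.0            # mode 2: components 1 and 3 are switched off
B[:, 3] = 0.0
C = np.ones((3, 4))      # mode 3: every component is active

rank, joint = select_rank([A, B, C])
print(rank, joint.tolist())  # 2 [True, False, True, False]
```

In this toy case the zero columns differ between modes, which is the situation the abstract describes: component 1 is sparse only in the second factor matrix, yet it still drops out of the joint pattern.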

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Visual Communication and Image Representation - Volume 30, July 2015, Pages 376–392
Authors
, , ,