Article code | Journal code | Publication year | English article | Full-text version
4944197 | 1437980 | 2017 | 48-page PDF | Free download
English title of the ISI article
Semi-supervised cross-entropy clustering with information bottleneck constraint
Persian translation of the title
خوشه بندی انتروپی نیمه نظارت شده با محدودیت تنگنا اطلاعات
Keywords
Semi-supervised clustering, partition-level side information, model-based clustering, cross-entropy, information bottleneck
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract
In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining the ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades off three conflicting goals: the accuracy with which the data set is modeled, the simplicity of the model, and the consistency of the clustering with side information. Experiments demonstrate that CEC-IB performs similarly to Gaussian mixture models in a classical semi-supervised scenario, but is faster, more robust to noisy labels, automatically determines the optimal number of clusters, and performs well when not all classes are present in the side information. Moreover, in contrast to many other semi-supervised models, it can be successfully applied in discovering natural subgroups if the partition-level side information is derived from the top levels of a hierarchical clustering.
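The three-way trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the standard cross-entropy clustering cost for a hard partition with one Gaussian per cluster, and it assumes that consistency with side information is measured as the conditional entropy of the user labels given the clusters, weighted by a hypothetical parameter `beta`. The function names, the `-1` marker for unlabeled points, and the exact form of the penalty are illustrative assumptions; the objective in the paper may differ.

```python
import numpy as np


def cec_cost(X, z):
    """Cross-entropy clustering cost of a hard partition z of the rows of X.

    Each cluster is modeled by a Gaussian fitted by maximum likelihood.
    The -log(p) term penalizes small clusters (model simplicity), the
    log-determinant term measures how well a Gaussian fits the cluster.
    Every cluster needs more than d points so its covariance is non-singular.
    """
    n, d = X.shape
    cost = 0.0
    for k in np.unique(z):
        Xk = X[z == k]
        p = len(Xk) / n
        cov = np.atleast_2d(np.cov(Xk, rowvar=False, bias=True))
        _, logdet = np.linalg.slogdet(cov)
        cost += p * (-np.log(p) + 0.5 * d * np.log(2 * np.pi * np.e) + 0.5 * logdet)
    return cost


def label_inconsistency(z, y):
    """Conditional entropy H(y | z) over labeled points only.

    Assumption for this sketch: y == -1 marks an unlabeled point.
    The value is 0 when every cluster contains a single user label.
    """
    mask = y >= 0
    z, y = z[mask], y[mask]
    n = len(y)
    if n == 0:
        return 0.0
    h = 0.0
    for k in np.unique(z):
        yk = y[z == k]
        _, counts = np.unique(yk, return_counts=True)
        q = counts / counts.sum()
        h += (len(yk) / n) * -(q * np.log(q)).sum()
    return h


def total_cost(X, z, y, beta=1.0):
    """Illustrative combined objective: CEC cost plus beta times label inconsistency."""
    return cec_cost(X, z) + beta * label_inconsistency(z, y)
```

Under these assumptions, a larger `beta` makes the partition follow the partial labeling more closely, while `beta = 0` reduces the objective to the unsupervised cross-entropy clustering cost.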
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 421, December 2017, Pages 254-271
Authors
, ,