Article code: 404590
Journal code: 677438
Publication year: 2016
English article: 11-page PDF
Full-text version: Free download
English title of the ISI article
Automatic labelling of clusters of discrete and continuous data with supervised machine learning
Persian translation of the title
برچسب زنی خودکار خوشه ها از داده های گسسته و پیوسته با یادگیری ماشین تحت نظارت
Keywords
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• This study presents a definition of the labelling problem and a solution that is based on techniques for supervised learning, unsupervised learning and a discretisation model.
• A method with unsupervised learning is applied to the clustering problem, and a supervised learning algorithm will detect the relevant attributes to define each formed cluster.
• Some strategies are used to form a methodology that presents a label (based on attributes and values) for each provided cluster.
• Discretisation methods will be used to determine the ranges of values of the attributes presented in the labels.
• This methodology is applied to three different databases, in which acceptable results were achieved with an average that exceeds 92.89% of correctly labelled elements.

Clustering is considered one of the most relevant problems in unsupervised learning research. However, understanding and defining the resulting clusters is not a trivial task, so each cluster must be identified, i.e., assigned a label. To address the problem of labelling clusters, this paper presents a methodology based on techniques for supervised learning, unsupervised learning and a discretisation model. An unsupervised learning method is applied to the clustering problem, and a supervised learning algorithm detects the meaningful attributes that define each formed cluster. These strategies are combined into a methodology that presents a label (based on attributes and values) for each provided cluster. The methodology is applied to three different databases, in which acceptable results were achieved with an average that exceeds 92.89% of correctly labelled elements.
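
The idea of a label built from attributes and value ranges can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes cluster assignments are already available, uses equal-width binning as the discretisation step, and swaps the supervised learner for a simple one-rule classifier that scores how well "attribute in range" separates a cluster from the rest. All names (`equal_width_bins`, `bin_index`, `label_clusters`, the sample data) are hypothetical.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Split [min(values), max(values)] into n_bins equal-width (low, high) ranges."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1.0  # avoid zero-width bins for constant columns
    return [(low + i * width, low + (i + 1) * width) for i in range(n_bins)]


def bin_index(value, bins):
    """Return the index of the first range whose upper edge lies above value."""
    for i, (_, high) in enumerate(bins):
        if value < high:
            return i
    return len(bins) - 1  # the column maximum falls into the last range


def label_clusters(rows, cluster_ids, attribute_names, n_bins=2):
    """Assign each cluster one (attribute, range) label (assumed stand-in method)."""
    columns = list(zip(*rows))
    bins = [equal_width_bins(col, n_bins) for col in columns]
    binned = [[bin_index(v, bins[j]) for j, v in enumerate(row)] for row in rows]
    labels = {}
    for cluster in sorted(set(cluster_ids)):
        member = [c == cluster for c in cluster_ids]
        best = None
        for j, name in enumerate(attribute_names):
            # Most frequent range of attribute j among the cluster's elements.
            counts = Counter(b[j] for b, m in zip(binned, member) if m)
            top_bin, hits = counts.most_common(1)[0]
            # Accuracy of the rule "element is in the cluster iff it falls in top_bin".
            correct = sum((b[j] == top_bin) == m for b, m in zip(binned, member))
            accuracy = correct / len(rows)
            if best is None or accuracy > best["accuracy"]:
                best = {
                    "attribute": name,
                    "range": bins[j][top_bin],
                    "accuracy": accuracy,
                    # Share of the cluster's elements that satisfy the label.
                    "coverage": hits / sum(member),
                }
        labels[cluster] = best
    return labels


# Hypothetical toy data: two attributes, two pre-computed clusters.
rows = [(1.0, 10.0), (1.2, 50.0), (0.9, 90.0), (5.0, 20.0), (5.1, 60.0), (4.8, 85.0)]
cluster_ids = [0, 0, 0, 1, 1, 1]
labels = label_clusters(rows, cluster_ids, ["size", "weight"])
for cluster, label in labels.items():
    print(cluster, label["attribute"], label["range"], label["coverage"])
```

In this sketch the `coverage` value plays a role loosely analogous to the share of correctly labelled elements reported in the abstract, although the paper's own evaluation procedure may differ.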

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 106, 15 August 2016, Pages 231–241
Authors
, , , , ,