Article code: 533478
Journal code: 870118
Publication year: 2012
English article: 12-page PDF
Full-text version: Free download
English title of the ISI article
A semi-supervised fuzzy clustering algorithm applied to gene expression data
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract

Over the last decade there has been increasing interest in semi-supervised clustering. Several studies have suggested that even a small amount of supervised information can significantly improve the results of unsupervised learning. One popular method of incorporating partial supervised information is through pair-wise constraints indicating whether a given pair of patterns should belong to the same cluster (Must-link) or to different clusters (Dont-link). In this study we propose a novel semi-supervised fuzzy clustering algorithm (SSFCA). The supervised information is incorporated via a method that quantifies Must-link and/or Dont-link constraints. Additionally, we present an extension of SSFCA that allows the algorithm to automatically detect the number of clusters in the data. We apply SSFCA to the intrinsic problem of gene expression profile clustering. The advantageous properties of fuzzy logic, inherited by SSFCA, allow genes to belong to more than one group, thereby revealing more profound information about their multiple functional roles. Finally, we investigate the incorporation of prior biological knowledge derived from Gene Ontology into the process of selecting pair-wise constraints. Simulations on artificial and real-life datasets showed that the proposed SSFCA significantly outperformed other standard and semi-supervised clustering methods.
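
The two ingredients named in the abstract, fuzzy memberships and pair-wise constraints, can be illustrated with a minimal Python sketch. It is not the SSFCA formulation, which the abstract does not spell out. It assumes the standard fuzzy c-means membership update and a generic penalty that counts how much membership mass violates each constraint. The function names `fcm_memberships` and `constraint_penalty`, the fuzzifier `m`, and the toy data are all hypothetical.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update (assumed here; not SSFCA's own rule)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        zero_hits = [d == 0.0 for d in dists]
        if any(zero_hits):
            # The point coincides with one or more centres: share membership among them.
            n = sum(zero_hits)
            memberships.append([1.0 / n if z else 0.0 for z in zero_hits])
            continue
        row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
        memberships.append(row)
    return memberships


def constraint_penalty(u, must_link, dont_link):
    """Generic violation measure (an assumption, not the paper's objective).

    Must-link pairs are penalised for membership mass placed in different
    clusters; Dont-link pairs for mass placed in the same cluster.
    """
    penalty = 0.0
    for a, b in must_link:
        same = sum(x * y for x, y in zip(u[a], u[b]))
        penalty += sum(u[a]) * sum(u[b]) - same
    for a, b in dont_link:
        penalty += sum(x * y for x, y in zip(u[a], u[b]))
    return penalty


# Toy data: two well-separated groups and one centre per group.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centers = [(0.0, 0.5), (5.0, 5.5)]
u = fcm_memberships(points, centers)

# Constraints that agree with the grouping, and the same pairs with roles swapped.
consistent = constraint_penalty(u, must_link=[(0, 1)], dont_link=[(0, 2)])
conflicting = constraint_penalty(u, must_link=[(0, 2)], dont_link=[(0, 1)])
```

Each row of `u` sums to 1, so a pattern (for example, a gene) can hold partial membership in several clusters at once. With this toy data, `consistent` comes out much smaller than `conflicting`, which is the kind of signal a constraint-aware objective could use to steer the clustering.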


► We present a semi-supervised fuzzy clustering algorithm named SSFCA.
► The algorithm can automatically detect the number of clusters.
► Supervision is provided via pair-wise constraints.
► We apply SSFCA to the intrinsic problem of gene expression profile clustering.
► External sources of information, such as Gene Ontology, are used to provide constraints.

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 45, Issue 1, January 2012, Pages 637–648
Authors
,