Article code: 485809
Journal code: 703338
Publication year: 2015
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
Text Clustering Based on a Divide and Merge Strategy
Persian translation of the title
خوشه بندی متن براساس استراتژی تقسیم و ادغام (Text clustering based on a divide and merge strategy)
Related subjects
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

A text clustering algorithm is proposed to overcome a drawback of division-based clustering methods: their sensitivity to the estimated number of classes. Complex features, including synonyms and co-occurring words, are extracted to build a feature space that carries more semantic information. A divide and merge strategy then helps the iteration converge to a reasonable number of clusters. Experimental results show that dynamically updating the number of centers prevents the clustering result from deteriorating when k deviates from the real number of classes. When k is too small or too large, the difference between the clustering results of the proposed algorithm, FC-DM, and those of k-means is more pronounced, and FC-DM also outperforms the other benchmark algorithms.
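
The abstract does not state the split or merge criteria, so the following is only a minimal sketch of the general divide-and-merge idea, not the FC-DM algorithm itself. It assumes documents are already mapped to numeric vectors (the synonym and co-occurrence feature extraction is not reproduced), and the names `divide_and_merge_kmeans`, `split_radius`, and `merge_dist` are hypothetical. A cluster is divided when its farthest member lies beyond `split_radius`, and two centers are merged when they are closer than `merge_dist`; both rules are assumptions chosen for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def assign(points, centers):
    """Group points by their nearest center (ties go to the lowest index)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def divide_and_merge_kmeans(points, k, split_radius, merge_dist, max_iter=50):
    """k-means-style loop in which the number of centers may change.

    Assumed rules (not taken from the paper):
      divide: a cluster whose farthest member is more than split_radius
              from its mean is reseeded with two of its own members;
      merge:  centers closer than merge_dist are replaced by their
              unweighted midpoint.
    """
    centers = list(points[:k])  # naive deterministic initialisation
    for _ in range(max_iter):
        clusters = [c for c in assign(points, centers) if c]
        candidates = []
        for c in clusters:
            m = mean(c)
            far = max(c, key=lambda p: dist(p, m))
            if len(c) > 1 and dist(far, m) > split_radius:
                # Divide: seed two centers from the most distant members.
                other = max(c, key=lambda p: dist(p, far))
                candidates.extend([far, other])
            else:
                candidates.append(m)
        merged = []
        for c in candidates:
            for i, m in enumerate(merged):
                if dist(c, m) < merge_dist:
                    merged[i] = mean([c, m])  # Merge: keep the midpoint.
                    break
            else:
                merged.append(c)
        if merged == centers:  # centers stopped changing
            break
        centers = merged
    return centers, assign(points, centers)


# Toy data: three well-separated groups of three points each.
points = [(0, 0), (1, 0), (0, 1),
          (10, 0), (11, 0), (10, 1),
          (5, 9), (6, 9), (5, 10)]

for k in (1, 3, 6):
    centers, _ = divide_and_merge_kmeans(points, k, split_radius=2.0, merge_dist=2.0)
    print(k, len(centers))
```

On this toy data the loop ends with three centers whether it starts from k = 1, 3, or 6, which illustrates the kind of insensitivity to the initial k that the abstract describes. The thresholds were picked by hand for this data; real text features would need criteria suited to their scale and dimensionality.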

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 55, 2015, Pages 825-832