Article code: 1869308
Journal code: 1530991
Publication year: 2012
English article: 7 pages, PDF
Full text: free download
English title of the ISI article
Chinese Text Clustering Algorithm Based k-means
Related topics
Engineering and Basic Sciences / Physics and Astronomy / Physics and Astronomy (General)
English abstract

Text clustering is an important method in text mining. This paper focuses on the process of Chinese text clustering based on k-means. Experiments showed that the new center of a cluster is easily affected by isolated texts. To reduce this effect, the average similarity of a cluster is used as a parameter and multiplied by a coefficient between 0.75 and 1.25 to obtain a similarity threshold. The texts whose similarity to the original cluster center is greater than or equal to the threshold are collected as a candidate set, and the cluster center is then updated with the center of the candidate set. The experiments show that the improved method increases purity and F value by about 10 percent on average over the original method.
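The center-update step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity over term-weight vectors is assumed here, and the function names (`cosine`, `mean_vector`, `update_center`), the `coeff` parameter, and the fallback for an empty candidate set are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def update_center(members, old_center, coeff=1.0):
    """Recompute one cluster center from a thresholded candidate set.

    members    -- text vectors currently assigned to the cluster
    old_center -- the cluster center before the update
    coeff      -- coefficient from the abstract, expected in [0.75, 1.25]
    """
    sims = [cosine(v, old_center) for v in members]
    # Threshold = coefficient * average similarity of the cluster.
    threshold = coeff * sum(sims) / len(sims)
    # Keep only texts at least as similar to the old center as the threshold.
    candidates = [v for v, s in zip(members, sims) if s >= threshold]
    if not candidates:
        # Assumption: fall back to the plain k-means mean when a large
        # coefficient filters out every member.
        candidates = members
    return mean_vector(candidates)


# Two similar texts plus one isolated text in the same cluster.
cluster = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
new_center = update_center(cluster, old_center=[1.0, 0.0], coeff=1.0)
plain_mean = mean_vector(cluster)
```

In this toy cluster the isolated vector `[0.0, 1.0]` falls below the threshold, so `new_center` stays close to the two similar texts, whereas `plain_mean` (the ordinary k-means update) is pulled toward the isolated one.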

Publisher
Database: Elsevier - ScienceDirect
Journal: Physics Procedia - Volume 33, 2012, Pages 301-307