Article code: 385842 | Journal code: 660873 | Publication year: 2011 | English article: 9 pages, PDF
English title of the ISI article
Research of fast SOM clustering for text information
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract

State-of-the-art text clustering methods struggle with the sheer volume of documents with high-dimensional features. In this paper, we study fast SOM (self-organizing map) clustering for text information. Our focus is on improving the efficiency of a text clustering system while maintaining high clustering quality. To achieve this goal, we separate the system into two stages: offline and online. To make the system more efficient, feature extraction and semantic quantization are done offline. Neurons are represented as numerical vectors in a high-dimensional space, but documents are represented as collections of important keywords, which differs from many related works and alleviates the time and space requirements of the offline stage. Based on this design, we propose fast clustering techniques for the online stage, including how to project documents onto the SOM output layer, a fast similarity computation method, and an incremental clustering scheme for real-time processing. We tested the system on different datasets; the results show that our approach is much more efficient than traditional methods while achieving comparable clustering quality.

Research highlights
► We separate the system into two stages: offline and online.
► Feature extraction and semantic quantization are done offline.
► Documents are represented as collections of important keywords.
► Clustering efficiency is superior, while clustering quality is comparable to traditional methods.
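
To make the keyword-based representation concrete, here is a minimal sketch of how a document stored as a small set of weighted keywords could be projected onto an SOM output layer. It is an illustration under stated assumptions, not the authors' implementation: the cosine similarity measure, the precomputed neuron norms, and the names `neuron_norms` and `best_matching_unit` are all hypothetical choices that the abstract does not specify.

```python
import math


def neuron_norms(neurons):
    """Precompute the Euclidean norm of every neuron weight vector.

    Assumption: this can be done once, ahead of document projection.
    """
    return [math.sqrt(sum(v * v for v in neuron)) for neuron in neurons]


def best_matching_unit(doc, neurons, norms):
    """Return (index, cosine similarity) of the neuron closest to a document.

    doc: dict mapping keyword index -> weight (sparse keyword collection).
    neurons: list of dense weight vectors, one per output node.
    norms: precomputed norms from neuron_norms(neurons).
    """
    doc_norm = math.sqrt(sum(w * w for w in doc.values()))
    best_index, best_score = None, float("-inf")
    for index, (neuron, norm) in enumerate(zip(neurons, norms)):
        if norm == 0.0 or doc_norm == 0.0:
            continue  # cosine similarity is undefined for zero vectors
        # The dot product visits only the document's own keywords, so its
        # cost grows with the keyword count, not with the vocabulary size.
        dot = sum(w * neuron[k] for k, w in doc.items())
        score = dot / (doc_norm * norm)
        if score > best_score:
            best_index, best_score = index, score
    return best_index, best_score


# Toy data: two neurons over a four-term vocabulary (hypothetical values).
neurons = [
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.6],
]
norms = neuron_norms(neurons)

# A document described only by keywords 2 and 3.
doc = {2: 1.0, 3: 1.0}
index, score = best_matching_unit(doc, neurons, norms)
print(index, round(score, 3))
```

In this toy run the document maps to the second neuron, because that is the only one with weight on the document's keywords. The design point the sketch tries to show is that a sparse document keeps each similarity computation proportional to the number of keywords, even when neurons remain dense, high-dimensional vectors.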

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 38, Issue 8, August 2011, Pages 9325–9333