Article code: 405183
Journal code: 677499
Publication year: 2013
English article: 11 pages, PDF
Full-text version: free download
English title of the ISI article
Projected-prototype based classifier for text categorization
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

The explosive growth of data is driving greater demand for text categorization. Existing prototype-based classifiers, including k-NN, kNNModel and the Centroid classifier, are receiving wide interest from the text mining community because of their simplicity and efficiency. However, they usually perform less effectively on document data sets because of the high dimensionality and complex class structures these sets involve. In most cases a single document category actually contains multiple subtopics, indicating that the documents in the same class may comprise multiple subclasses, each associated with its own term subspace. In this paper, a novel projected-prototype based classifier is proposed for text categorization, in which a document category is represented by a set of prototypes, each combining a representative of the documents in a subclass with its corresponding term subspace. In the classifier's training process, the number of prototypes and the prototypes themselves are learned using a newly developed feature-weighting algorithm, in order to ensure that the documents belonging to different subclasses are separated as much as possible when projected onto their own subspaces. Then, in the testing process, each test document is classified in terms of its weighted distances from the different prototypes. Experimental results on the Reuters-21578 and 20-Newsgroups corpora show that the proposed classifier, based on the multi-representative-dependent projection method, can achieve higher classification accuracy at a lower computational cost than conventional prototype-based classifiers, especially for data sets that include overlapping document categories.
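
The testing step described in the abstract can be illustrated with a minimal sketch. It assumes each prototype is stored as a center vector, a per-term weight vector (standing in for its term subspace) and a class label, and that the weighted distance is a weighted squared Euclidean distance. These are assumptions made for illustration only: the abstract does not give the paper's actual distance function or its feature-weighting algorithm, and the names `weighted_distance`, `classify` and the toy values are hypothetical.

```python
import numpy as np

def weighted_distance(x, center, weights):
    """Weighted squared Euclidean distance (an assumed form, not the paper's)."""
    diff = x - center
    return float(np.sum(weights * diff * diff))

def classify(x, prototypes):
    """Return the label of the prototype with the smallest weighted distance to x."""
    best = min(
        prototypes,
        key=lambda p: weighted_distance(x, p["center"], p["weights"]),
    )
    return best["label"]

# Toy vocabulary of four terms; every number below is made up for illustration.
# The "sports" category has two prototypes, one per hypothetical subclass,
# each weighting a different subset of terms.
prototypes = [
    {"label": "sports",
     "center": np.array([0.9, 0.8, 0.0, 0.0]),
     "weights": np.array([0.5, 0.5, 0.0, 0.0])},
    {"label": "sports",
     "center": np.array([0.0, 0.9, 0.7, 0.0]),
     "weights": np.array([0.0, 0.5, 0.5, 0.0])},
    {"label": "finance",
     "center": np.array([0.0, 0.0, 0.1, 0.9]),
     "weights": np.array([0.0, 0.0, 0.2, 0.8])},
]

doc_a = np.array([0.0, 0.1, 0.2, 0.8])
doc_b = np.array([0.8, 0.9, 0.0, 0.0])
label_a = classify(doc_a, prototypes)
label_b = classify(doc_b, prototypes)
```

In this sketch a category may own several prototypes, so a document only needs to be close to one subclass of a category, measured in that subclass's own weighted terms, to receive the category's label. Learning the number of prototypes and their weights, which the abstract attributes to the training process, is not covered here.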

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 49, September 2013, Pages 179–189