Article ID: 410883
Journal: Neurocomputing
Published Year: 2006
Pages: 8 Pages
File Type: PDF
Abstract

In text management tasks, dimensionality reduction is necessary both for computation and for the interpretability of the results generated by machine learning algorithms. This paper describes a feature extraction method called semantic mapping. Semantic mapping, sparse random mapping, and PCA are applied to the self-organization of document collections using a self-organizing map (SOM). The behavior of each method when projecting binary and tf-idf document vector representations is compared. Performance is compared using the classification error of the SOM maps on text categorization of the K1 collection. Semantic mapping generated better document representations than sparse random mapping.
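The abstract does not spell out how semantic mapping is constructed, so the following is only a minimal sketch of the sparse random mapping baseline, in one common form: each original term is assigned a fixed number of randomly chosen output dimensions, and a document's weight for that term is added to each of them. The toy corpus, the function names, and the parameters `n_out` and `ones_per_term` are illustrative assumptions, not taken from the paper. The SOM stage is omitted.

```python
import math
import random


def build_vocab(docs):
    """Map each distinct term to a column index (sorted for determinism)."""
    vocab = sorted({term for doc in docs for term in doc.split()})
    return {term: i for i, term in enumerate(vocab)}


def binary_vectors(docs, vocab):
    """1.0 if the term occurs in the document, else 0.0."""
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for term in set(doc.split()):
            vec[vocab[term]] = 1.0
        vectors.append(vec)
    return vectors


def tfidf_vectors(docs, vocab):
    """Raw term frequency times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = [0] * len(vocab)
    for doc in docs:
        for term in set(doc.split()):
            doc_freq[vocab[term]] += 1
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for term in doc.split():
            vec[vocab[term]] += 1.0
        for i, tf in enumerate(vec):
            if tf > 0:
                vec[i] = tf * math.log(n_docs / doc_freq[i])
        vectors.append(vec)
    return vectors


def sparse_random_map(n_terms, n_out, ones_per_term, seed=0):
    """For each term, pick `ones_per_term` distinct output dimensions at random.

    This is the sparse form of the projection matrix: only the positions of
    the ones are stored. (Assumed construction, labeled hypothetical.)
    """
    rng = random.Random(seed)
    return [rng.sample(range(n_out), ones_per_term) for _ in range(n_terms)]


def project(vec, mapping, n_out):
    """Add each nonzero term weight to that term's target dimensions."""
    out = [0.0] * n_out
    for term_idx, weight in enumerate(vec):
        if weight != 0.0:
            for target in mapping[term_idx]:
                out[target] += weight
    return out


# Toy corpus (hypothetical; the paper uses the K1 collection).
docs = [
    "neural network learning",
    "neural map clustering",
    "text document clustering map",
]
n_out = 4          # reduced dimensionality (assumed)
ones_per_term = 2  # ones per term in the random matrix (assumed)

vocab = build_vocab(docs)
mapping = sparse_random_map(len(vocab), n_out, ones_per_term)

binary = binary_vectors(docs, vocab)
tfidf = tfidf_vectors(docs, vocab)

low_binary = [project(v, mapping, n_out) for v in binary]
low_tfidf = [project(v, mapping, n_out) for v in tfidf]
```

Storing only the positions of the ones keeps the projection cost proportional to the number of nonzero terms in a document, which is the usual motivation for a sparse random matrix. The reduced vectors `low_binary` and `low_tfidf` would then serve as SOM inputs.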

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence