Article ID: 489615
Journal: Procedia Computer Science
Published Year: 2015
Pages: 10 Pages
File Type: PDF
Abstract

As the information society develops rapidly, it has become increasingly important to handle high dimension low sample size (HDLSS) data, in which the number of variables is much larger than the number of objects. Selecting effective variables for such data is correspondingly more crucial. This paper proposes a variable selection method for labeled HDLSS data based on cluster loading. A conventional cluster loading model based on principal component analysis has been proposed previously, but it cannot be applied to HDLSS data. We therefore propose a cluster loading model that uses a clustering result. Using the obtained cluster loading, we can select variables which belong to clusters unrelated to the given discrimination information represented by the labels of objects. Several numerical examples demonstrate the better performance of the proposed method.
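
The HDLSS setting described above can be illustrated with a small numerical sketch. It is not the paper's method. It only shows one well-known property of such data that plausibly underlies the difficulty: with far fewer objects than variables, the centered data matrix (and hence the sample covariance matrix) has rank at most the number of objects minus one, so models that need a full-rank covariance are not directly usable. The function name, sizes, and random data below are hypothetical choices made for illustration.

```python
import numpy as np


def hdlss_rank_demo(n_objects=10, n_variables=200, seed=0):
    """Return the rank of a centered HDLSS data matrix (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Hypothetical HDLSS data: few objects (rows), many variables (columns).
    X = rng.normal(size=(n_objects, n_variables))
    # Center each variable; this removes one degree of freedom.
    Xc = X - X.mean(axis=0)
    # The sample covariance Xc.T @ Xc / (n - 1) has the same rank as Xc,
    # so it suffices to compute the rank of the small matrix.
    return int(np.linalg.matrix_rank(Xc))


rank = hdlss_rank_demo()
print("rank of centered data:", rank, "out of 200 variables")
```

Because the rank is bounded by the number of objects rather than the number of variables, the 200 x 200 sample covariance matrix in this sketch is singular.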

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)