Article code | Journal code | Publication year | English article | Full-text version
534705 | 870280 | 2011 | 6 pages, PDF | Free download
English title of the ISI article
A sparse nearest mean classifier for high dimensional multi-class problems
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
Abstract

The analysis of small datasets in high dimensional spaces is inherently difficult. For two-class classification problems there are a few methods that are able to face the so-called curse of dimensionality. However, for multi-class sparsely sampled datasets there are hardly any specific methods. In this paper, we propose four multi-class classifier alternatives that effectively deal with this type of data. Moreover, these methods implicitly select a feature subset optimized for class separation. Accordingly, they are especially interesting for domains where an explanation of the problem in terms of the original features is desired.

In the experiments, we applied the proposed methods to an MDMA powders dataset, where the problem was to recognize the production process. It turns out that the proposed multi-class classifiers perform well, while the few utilized features correspond to known MDMA synthesis ingredients. In addition, to show the general applicability of the methods, we applied them to several other sparse datasets, ranging from bioinformatics to chemometrics datasets, with as few as tens of samples in tens to even thousands of dimensions and three to four classes. The proposed methods had the best average performance, while very few dimensions were effectively utilized.

Research highlights
► A sparse model for the classification of high-dimensional datasets that uses a small number of the original dimensions.
► A true multi-class method for high-dimensional classification tasks.
► A method that generally outperforms state-of-the-art methods like linear SVM and PLS-DA for sparse high-dimensional datasets.
► Linear programming based model for which very efficient solvers exist.
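
To make the idea behind the highlights concrete, the following is a minimal Python sketch of a nearest mean classifier that only uses a subset of the original features. It is an illustration under stated assumptions, not the paper's method: the function names `fit_class_means` and `predict`, the weighted absolute-difference distance, and the toy data are all hypothetical, and the feature weights are picked by hand here, whereas the paper obtains its sparse model through a linear programming formulation that this page does not spell out.

```python
from collections import defaultdict
from math import fsum


def fit_class_means(samples, labels):
    """Return {label: mean vector}, computed over all features."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {
        y: [fsum(col) / len(rows) for col in zip(*rows)]
        for y, rows in grouped.items()
    }


def predict(x, means, weights):
    """Assign x to the class whose mean is nearest under a weighted distance.

    Features with weight 0 are ignored, so a sparse weight vector means
    that only a few original features influence the decision.
    """
    def distance(mean):
        return fsum(w * abs(a - b) for w, a, b in zip(weights, x, mean))

    return min(means, key=lambda label: distance(means[label]))


# Toy data (made up): three classes, four features; features 2 and 3 are noise.
X = [[0.0, 0.1, 5.0, -3.0], [0.2, 0.0, -4.0, 2.0],
     [1.0, 0.1, 3.0, 1.0], [1.1, 0.0, -2.0, -1.0],
     [0.1, 1.0, 0.5, 4.0], [0.0, 1.2, -1.0, -5.0]]
y = ["a", "a", "b", "b", "c", "c"]

means = fit_class_means(X, y)
weights = [1.0, 1.0, 0.0, 0.0]  # assumption: hand-picked, NOT learned by an LP

print(predict([1.05, 0.05, 9.0, 9.0], means, weights))
```

Because the last two weights are zero, the large values in the noisy features of the query do not affect the prediction, which is the interpretability benefit the abstract refers to: the decision can be explained in terms of the few features that carry non-zero weight.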

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 32, Issue 6, 15 April 2011, Pages 854–859