Article ID: 415565
Journal: Computational Statistics & Data Analysis
Published Year: 2007
Pages: 15
File Type: PDF
Abstract

The standard support vector machine (SVM) minimizes the hinge loss function subject to the L2 penalty or the roughness penalty. Recently, the L1 SVM was suggested for variable selection because it produces sparse solutions [Bradley, P., Mangasarian, O., 1998. Feature selection via concave minimization and support vector machines. In: Shavlik, J. (Ed.), ICML'98. Morgan Kaufmann, Los Altos, CA; Zhu, J., Hastie, T., Rosset, S., Tibshirani, R., 2003. 1-norm support vector machines. Neural Inform. Process. Systems 16]. These learning methods are non-adaptive since their penalty forms are pre-determined before looking at the data, and each tends to perform well only in certain situations. For instance, the L2 SVM generally works well except when there are many noise inputs, while the L1 SVM is preferable in the presence of many noise variables. In this article we propose and explore an adaptive learning procedure called the Lq SVM, in which the best q > 0 is chosen automatically by the data. Both two-class and multi-class classification problems are considered. We show that the new adaptive approach combines the benefits of a class of non-adaptive procedures and achieves the best performance of that class across a variety of situations. Moreover, we observe that the proposed Lq penalty is more robust to noise variables than the L1 and L2 penalties. An iterative algorithm is suggested to solve the Lq SVM efficiently. Simulations and real data applications support the effectiveness of the proposed procedure.
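To make the penalty forms concrete, the methods discussed above can be summarized as a penalized hinge-loss problem. The sketch below uses a generic formulation that may differ from the paper's exact notation: y_i in {-1, +1} are the class labels, lambda > 0 is a tuning parameter, and [u]_+ = max(u, 0) denotes the hinge loss.

    \min_{w \in \mathbb{R}^p,\; b \in \mathbb{R}} \;
      \sum_{i=1}^{n} \bigl[\, 1 - y_i \, (w^\top x_i + b) \,\bigr]_+
      \; + \; \lambda \sum_{j=1}^{p} |w_j|^{q},
    \qquad q > 0

Here q = 2 recovers the standard SVM penalty and q = 1 the L1 SVM, while the adaptive Lq SVM treats q itself as a tuning parameter selected from the data, for example by cross-validation alongside lambda.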

Related Topics
Physical Sciences and Engineering › Computer Science › Computational Theory and Mathematics