Article ID Journal Published Year Pages File Type
416946 Computational Statistics & Data Analysis 2011 18 Pages PDF
Abstract

Ordering of regression or classification coefficients occurs in many real-world applications. Fused Lasso exploits this ordering by explicitly regularizing the differences between neighboring coefficients through an ℓ1ℓ1 norm regularizer. However, due to nonseparability and nonsmoothness of the regularization term, solving the fused Lasso problem is computationally demanding. Existing solvers can only deal with problems of small or medium size, or a special case of the fused Lasso problem in which the predictor matrix is the identity matrix. In this paper, we propose an iterative algorithm based on the split Bregman method to solve a class of large-scale fused Lasso problems, including a generalized fused Lasso and a fused Lasso support vector classifier. We derive our algorithm using an augmented Lagrangian method and prove its convergence properties. The performance of our method is tested on both artificial data and real-world applications including proteomic data from mass spectrometry and genomic data from array comparative genomic hybridization (array CGH). We demonstrate that our method is many times faster than the existing solvers, and show that it is especially efficient for large pp, small nn problems, where pp is the number of variables and nn is the number of samples.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, ,