Article ID: 1180508
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2015
Pages: 8
File Type: PDF
Abstract

• We propose using fused lasso logistic regression (FLLR) to classify spectral data.
• We show that the FLLR selects or deselects a group of highly correlated variables together.
• The FLLR can resolve the well-known peak misalignment problem of spectral data by providing data-dependent binning.
• The FLLR also provides a more interpretable classifier than other ℓ1-regularization methods.
• The advantages of the FLLR over other ℓ1-regularized methods are illustrated with mass spectral data of herbal medicines.

Spectral data contain powerful information that can be used to identify unknown compounds and their chemical structures. In this paper, we study fused lasso logistic regression (FLLR) for classifying spectral data into two groups. We show that the FLLR has a grouping property on the regression coefficients: it selects a group of highly correlated variables together. Both the sparsity and the grouping property of the FLLR provide great advantages in the analysis of spectral data. In particular, the FLLR resolves the well-known peak misalignment problem of spectral data by providing data-dependent binning, and it yields a more interpretable classifier than other ℓ1-regularization methods. We also analyze gas chromatography/mass spectrometry data to classify the origin of herbal medicines, and illustrate the advantages of the FLLR over other existing ℓ1-regularized methods.
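
The abstract does not state the objective function, so the following is only a minimal sketch of how sparsity and grouping can arise, assuming the standard fused lasso penalty (an ℓ1 term on the coefficients plus an ℓ1 term on differences of adjacent coefficients) added to the average logistic negative log-likelihood. The function names, the tuning parameters `lam1` and `lam2`, and the division by the sample size are illustrative assumptions, not the authors' formulation. The second function shows how runs of equal nonzero coefficients can be read off as data-dependent bins.

```python
import math


def fllr_objective(X, y, beta0, beta, lam1, lam2):
    """Penalized logistic loss with a fused lasso penalty (assumed form).

    X    : list of spectra, each a list of channel intensities
    y    : class labels in {0, 1}
    beta : one coefficient per channel, ordered along the spectral axis
    lam1 : weight of the sparsity term (hypothetical name)
    lam2 : weight of the fusion term (hypothetical name)
    """
    nll = 0.0
    for row, label in zip(X, y):
        eta = beta0 + sum(b * x for b, x in zip(beta, row))
        # log(1 + exp(eta)), written in a numerically stable way
        nll += max(eta, 0.0) + math.log1p(math.exp(-abs(eta))) - label * eta
    sparsity = sum(abs(b) for b in beta)
    # Penalize differences between neighboring channels only
    fusion = sum(abs(b - a) for a, b in zip(beta, beta[1:]))
    return nll / len(y) + lam1 * sparsity + lam2 * fusion


def coefficient_bins(beta, tol=1e-8):
    """Return (first_index, last_index, value) for each run of adjacent
    channels that share the same nonzero coefficient."""
    bins, start = [], None
    for j, b in enumerate(beta):
        active = abs(b) > tol
        # Close the current run if this channel is zero or has a new value
        if start is not None and (not active or abs(b - beta[start]) > tol):
            bins.append((start, j - 1, beta[start]))
            start = None
        if active and start is None:
            start = j
    if start is not None:
        bins.append((start, len(beta) - 1, beta[start]))
    return bins


# Hypothetical fitted coefficients: two fused groups separated by zeros
demo_beta = [0.0, 0.0, 0.5, 0.5, 0.5, 0.0, -1.2, -1.2]
demo_bins = coefficient_bins(demo_beta)
```

Under this assumed penalty, a coefficient vector that is constant over neighboring channels incurs no fusion cost inside each group, which is why a fitted model tends to treat a misaligned peak spread over adjacent channels as a single bin.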
