Article ID: 7563247
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2015
Pages: 7
File Type: PDF
Abstract
Supervised classification, a fundamental approach to classifying e-nose data, requires sufficient labeled data for training. Obtaining sufficient labeled data, however, is costly in money, materials, energy and time. In this paper, a semi-supervised approach, Cluster-then-Label, which uses labeled and unlabeled data together to build a better classifier from fewer labeled training data, was applied to e-nose data for the first time. A novel clustering algorithm, spectral clustering, was also introduced to improve this semi-supervised approach. Three experiments (discriminating storage shelf life (SL), identifying pretreatments, and authenticating juices) were conducted on cherry tomato juices using a PEN 2 e-nose, generating three datasets with different data structures. For each dataset, only 20% of the data were selected for training. Classifications of the datasets by this semi-supervised approach and by four supervised approaches (linear discriminant analysis (LDA), quadratic discriminant analysis, multi-class support vector machine and back-propagation neural network) were compared. The results indicate that the spectral-clustering-based semi-supervised approach outperforms the supervised approaches in all cases, which makes it possible to build reliable classifiers with only a few labeled data. It is worth noting, however, that the new approach shows no remarkable superiority over LDA. Our next step is therefore to test it on more e-nose datasets.
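The labeling step of a Cluster-then-Label scheme can be illustrated with a minimal sketch. It assumes that every sample (labeled and unlabeled) has already been assigned to a cluster, for example by spectral clustering, and that each cluster simply takes the majority label of the labeled samples it contains. The abstract does not state how labels are assigned within a cluster, so the majority vote, the fallback for clusters without labeled samples, the function name `cluster_then_label`, and the class names `"fresh"` and `"stored"` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def cluster_then_label(cluster_ids, known_labels):
    """Label every sample from a clustering plus a few known labels.

    cluster_ids  : list of cluster indices, one per sample
    known_labels : dict {sample index: class label} for the labeled subset
    """
    # Collect the known labels that fall into each cluster.
    votes = defaultdict(Counter)
    for idx, label in known_labels.items():
        votes[cluster_ids[idx]][label] += 1

    # Assumed fallback for clusters with no labeled sample:
    # the most common class among all labeled samples.
    overall = Counter(known_labels.values()).most_common(1)[0][0]

    predicted = []
    for idx, cid in enumerate(cluster_ids):
        if idx in known_labels:
            predicted.append(known_labels[idx])  # keep the given label
        elif votes[cid]:
            predicted.append(votes[cid].most_common(1)[0][0])  # majority vote
        else:
            predicted.append(overall)
    return predicted


# Hypothetical toy data: 10 samples in 2 clusters, 20% of them labeled.
cluster_ids = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
known_labels = {0: "fresh", 5: "stored"}
predicted = cluster_then_label(cluster_ids, known_labels)
```

In practice the `cluster_ids` list could come from any clustering routine, for instance the `fit_predict` method of scikit-learn's `SpectralClustering`; the quality of the final labels then depends on how well the clusters align with the true classes.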