Article ID: 476894
Journal: European Journal of Operational Research
Published Year: 2012
Pages: 9
File Type: PDF
Abstract

In many industrial processes, hundreds of noisy and correlated process variables are collected for monitoring and control purposes. The goal is often to correctly classify production batches into classes, such as good or failed, based on the process variables. We propose a method for selecting the best process variables for classifying process batches using multiple criteria, including classification performance measures (i.e., sensitivity and specificity) and the measurement cost. The method applies Partial Least Squares (PLS) regression to the training set to derive an importance index for each variable. An iterative classification/elimination procedure using the k-Nearest Neighbor classifier is then carried out. Finally, Pareto analysis is used to select the best set of variables and to avoid retaining more variables than necessary. The proposed method consistently selects the process variables that are important for classification, regardless of which batches are included in the training data. We demonstrate the advantages of the proposed method using six industrial datasets.

► We select process variables to classify production batches using multiple criteria.
► The method reduces the percentage of retained variables and improves classification performance.
► The method is applied to six industrial datasets.
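
The elimination and Pareto steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the `importance` list is a hypothetical stand-in for the PLS-derived importance index (no PLS is fitted here), leave-one-out evaluation replaces whatever training/testing scheme the paper uses, the number of retained variables stands in for the measurement cost, and class 1 ("good") is treated as the positive class. The toy data are invented.

```python
def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    ranked = sorted(zip(train_X, train_y), key=lambda pair: sq_dist(pair[0], x))
    votes = sum(label for _, label in ranked[:k])
    return 1 if 2 * votes > k else 0


def loo_sens_spec(X, y, cols, k=3):
    """Leave-one-out sensitivity and specificity using only the columns in `cols`.

    Assumes both classes (0 and 1) are present in `y`.
    """
    Xs = [[row[c] for c in cols] for row in X]
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for i, (x, label) in enumerate(zip(Xs, y)):
        pred = knn_predict(Xs[:i] + Xs[i + 1:], y[:i] + y[i + 1:], x, k)
        totals[label] += 1
        hits[label] += int(pred == label)
    return hits[1] / totals[1], hits[0] / totals[0]


def eliminate(X, y, importance, k=3):
    """Classify, then drop the least important variable, until none remain."""
    order = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
    results = []
    while order:
        sens, spec = loo_sens_spec(X, y, order, k)
        results.append({"vars": tuple(sorted(order)), "sens": sens,
                        "spec": spec, "n": len(order)})
        order = order[:-1]  # remove the variable with the lowest importance
    return results


def pareto_front(results):
    """Keep subsets not dominated on (sensitivity up, specificity up, size down)."""
    def dominates(a, b):
        no_worse = a["sens"] >= b["sens"] and a["spec"] >= b["spec"] and a["n"] <= b["n"]
        better = a["sens"] > b["sens"] or a["spec"] > b["spec"] or a["n"] < b["n"]
        return no_worse and better
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Invented toy batches: variable 0 separates the classes, variables 1 and 2 are noise.
X = [
    [1.0, 7.0, 3.0], [1.1, 2.0, 9.0], [0.9, 8.0, 1.0], [1.2, 3.0, 6.0],  # good (1)
    [5.0, 7.5, 2.0], [5.1, 2.5, 8.0], [4.9, 8.5, 4.0], [5.2, 3.5, 7.0],  # failed (0)
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
importance = [0.9, 0.2, 0.1]  # hypothetical values, not computed by PLS

results = eliminate(X, y, importance)
front = pareto_front(results)
for r in front:
    print(r["vars"], r["sens"], r["spec"])
```

On this toy data the subset containing only variable 0 classifies every batch correctly, so it dominates the larger subsets and is the only one left on the Pareto front. With real data the front usually contains several subsets, and choosing among them is where the trade-off between classification performance and measurement cost is made.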

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)