Pre-analysis of Multi-batch Bioprocesses Data with Finite Mixture Models in the Reduced Feature Subspace

Article ID	Journal	Published Year	Pages	File Type
718506	IFAC Proceedings Volumes	2010	6 Pages	PDF

Abstract

Multi-batch bioprocess data, unlike data from other industries, are highly correlated because of the operating characteristics of the industry. In this work, pairwise Fisher discriminant analysis (FDA) is used to reveal the similarity between two batches. To handle the mixture pattern of the data projected into the reduced feature subspace spanned by the first several generalized eigenvectors, a finite Gaussian mixture model is adopted to calculate the confidence region of each mixture component. Application engineers face several challenges when estimating finite mixture models (FMMs), such as initializing the expectation-maximization (EM) algorithm and determining the number of mixture components. This work proposes an initialization method based on a uniform prior distribution assumption and a new method, based on the estimated density histogram, for determining the number of FMM components. The utility of the proposed method is demonstrated in simulation studies. Combined with pairwise FDA, the method has been successfully applied to a large-scale multi-batch bioprocess data set.
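As a rough illustration of the EM step mentioned in the abstract, the sketch below fits a one-dimensional Gaussian mixture to synthetic scores standing in for data projected onto a single discriminant direction. It is not the paper's procedure: the function name `fit_gmm_1d`, the evenly spaced initial means with equal weights, the synthetic data, and the fixed component count of two are all assumptions made for this example, and the paper's own initialization and histogram-based component selection are not reproduced here.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-6):
    """Plain EM for a k-component 1-D Gaussian mixture (illustrative only)."""
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed initialization: means spread evenly over the data range,
    # equal weights, and the overall sample variance for every component.
    means = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)

    return weights, means, variances


# Synthetic "projected scores": two well-separated groups (assumed data).
random.seed(0)
scores = ([random.gauss(-3.0, 1.0) for _ in range(300)]
          + [random.gauss(3.0, 1.0) for _ in range(300)])

weights, means, variances = fit_gmm_1d(scores, k=2)

# Approximate 95% interval per component (mean +/- 1.96 standard deviations),
# a 1-D stand-in for the confidence region of each mixture component.
intervals = [(m - 1.96 * math.sqrt(v), m + 1.96 * math.sqrt(v))
             for m, v in zip(means, variances)]

for w, m, (a, b) in zip(weights, means, intervals):
    print(f"weight={w:.3f} mean={m:.3f} interval=({a:.3f}, {b:.3f})")
```

In a reduced subspace with more than one dimension, the per-component interval would become an ellipsoid defined by each component's covariance matrix; the one-dimensional case is used here only to keep the sketch short.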