Article ID: 4960537
Journal: Procedia Computer Science
Published Year: 2017
Pages: 8 Pages
File Type: PDF
Abstract

Data mining to discover patterns and aid decisions is key to using massive data for process automation and optimization. An especially challenging data mining problem is kriging, i.e., prediction of multiple, related variables from latent patterns in the data. We present a manifold-based machine learning approach to discover patterns in massive, correlated, high-dimensional data. Dimensionality reduction using a manifold is a type of non-linear principal component analysis (PCA). The manifold captures the underlying structure of the inputs and corresponding outputs by projecting the data onto a set of basis functions defined by the manifold. These bases ensure that any future adjustments affect the model in accordance with the natural geometry of the data. We chose the manifold learning technique for its robustness against unbalanced data. Our contribution, described in this paper, enables interactive and incremental learning, i.e., incremental adjustment of the manifold (and its predictions) based on new observations and on user corrections to the predicted values, without rerunning the analysis on the full data set. Our experiments demonstrate that prediction performance remains equivalent to Multi-kernel Gaussian Processes on standard data sets despite these practically useful enhancements.
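
The idea of adjusting a model from new observations without a full rerun can be illustrated with a small sketch. This is not the paper's manifold method: it substitutes recursive least squares on a fixed set of Gaussian basis functions, and every name in it (`rbf_features`, `IncrementalBasisModel`, the basis width, the ridge term, the synthetic data) is a hypothetical choice made for illustration. A user correction is assumed to be handled like any other new observation.

```python
import numpy as np

def rbf_features(X, centers, width):
    """Project inputs onto Gaussian basis functions centred at `centers`."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

class IncrementalBasisModel:
    """Hypothetical multi-output model updated by recursive least squares."""

    def __init__(self, centers, width, ridge=1e-2):
        self.centers = centers
        self.width = width
        k = centers.shape[0]
        self.P = np.eye(k) / ridge  # running inverse of (Phi^T Phi + ridge * I)
        self.W = None               # basis weights, shape (k, n_outputs)

    def update(self, X, Y):
        """Fold new (or user-corrected) samples in, one at a time."""
        Phi = rbf_features(X, self.centers, self.width)
        if self.W is None:
            self.W = np.zeros((Phi.shape[1], Y.shape[1]))
        for phi, y in zip(Phi, Y):
            P_phi = self.P @ phi
            gain = P_phi / (1.0 + phi @ P_phi)
            self.W += np.outer(gain, y - phi @ self.W)  # correct by the residual
            self.P -= np.outer(gain, P_phi)             # rank-one inverse update

    def predict(self, X):
        return rbf_features(X, self.centers, self.width) @ self.W

# Synthetic data: one input, two related outputs (an assumption for the demo).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 1))
Y = np.hstack([np.sin(2 * np.pi * X), np.cos(2 * np.pi * X)])
Y = Y + 0.05 * rng.normal(size=(200, 2))
centers = np.linspace(0.0, 1.0, 8)[:, None]

model = IncrementalBasisModel(centers, width=0.15, ridge=1e-2)
model.update(X[:150], Y[:150])   # initial fit
model.update(X[150:], Y[150:])   # later observations, no refit of the first 150

# Batch ridge fit on the full data set, used only as a reference.
Phi = rbf_features(X, centers, 0.15)
W_batch = np.linalg.solve(Phi.T @ Phi + 1e-2 * np.eye(8), Phi.T @ Y)
max_gap = np.abs(model.predict(X) - Phi @ W_batch).max()
```

In this sketch the incremental predictions should agree with the batch fit up to floating-point error (`max_gap` measures the difference), which is the property that makes per-sample updates a substitute for rerunning the full analysis.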

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)