Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4944781 | Information Sciences | 2016 | 13 Pages | |
Abstract
This paper develops a framework for high-dimensional data regression in which model interpretation and prediction accuracy are regularized. Taking the application background into account, we suppose that the samples collected for building learner models are expensive and limited. Our technical contributions are twofold: the generation of ensemble features (EF) using Lasso models with selected regularizing factors estimated via a cross-validation procedure; and predictive model building using neural networks with random weights, where the weights and biases of the hidden nodes are assigned randomly in a specific interval and the output weights are evaluated analytically by a regularized least squares method. Comparative experiments on estimating the protein content of milk from its NMR spectrum are carried out on a data set with 31,570 dimensions (spectrum size) and 120 samples. The results demonstrate that the proposed solution is promising for data regression problems with small samples and high dimensionality, and that the learning system performs robustly with respect to a key parameter setting in the ensemble feature generation.
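The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions: the ensemble feature set is taken here to be the union of the features that Lasso keeps at each regularizing factor, the factors are passed in as a fixed list rather than estimated by cross-validation, the hidden nodes use a sigmoid activation, and all function names (`lasso_cd`, `ensemble_features`, `fit_nnrw`, `predict_nnrw`) and default values are hypothetical.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso for min (1/2n)||y - Xw||^2 + lam * ||w||_1."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * ||x_j||^2 for each column
    r = np.asarray(y, dtype=float).copy()  # residual y - Xw, with w = 0 at start
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = X[:, j] @ r / n + col_sq[j] * w[j]
            # Soft-thresholding update.
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            if w_new != w[j]:
                r -= X[:, j] * (w_new - w[j])
                w[j] = w_new
    return w


def ensemble_features(X, y, lams):
    """Assumed reading of EF: union of the features Lasso selects per factor."""
    selected = set()
    for lam in lams:
        selected.update(np.flatnonzero(lasso_cd(X, y, lam)).tolist())
    return np.array(sorted(selected), dtype=int)


def _hidden(X, W, b):
    """Sigmoid hidden-layer output (the activation is an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def fit_nnrw(X, y, n_hidden=50, scale=1.0, ridge=1e-2, rng=None):
    """Random hidden weights in [-scale, scale]; ridge solution for the output."""
    rng = np.random.default_rng(rng)
    W = rng.uniform(-scale, scale, size=(X.shape[1], n_hidden))
    b = rng.uniform(-scale, scale, size=n_hidden)
    H = _hidden(X, W, b)
    # Regularized least squares: beta = (H^T H + ridge * I)^{-1} H^T y
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return W, b, beta


def predict_nnrw(model, X):
    """Apply a fitted (W, b, beta) model to new inputs."""
    W, b, beta = model
    return _hidden(X, W, b) @ beta
```

A typical call would be `idx = ensemble_features(X_train, y_train, lams)` followed by `model = fit_nnrw(X_train[:, idx], y_train)`. Only the output weights are learned, so training reduces to one linear solve, which suits the small-sample setting the abstract describes.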
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Caihao Cui, Dianhui Wang