Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
7561946 | Chemometrics and Intelligent Laboratory Systems | 2018 | 9 Pages |
Abstract
The fields of chemoinformatics and chemometrics require regression models with high predictive performance. To construct predictive regression models by appropriately detecting outlier samples, a new outlier detection and regression method based on ensemble learning is proposed. Multiple regression models are constructed, and y-values are estimated from them by ensemble learning. Outlier samples are then detected by comprehensively considering all regression models. Furthermore, repeating the calculations makes it possible to detect outlier samples robustly and independently. Analysis of a numerical simulation dataset, two quantitative structure-activity relationship datasets, and two quantitative structure-property relationship datasets confirms that outlier samples can be detected automatically, informative compounds can be selected, and the estimation performance of regression models is improved.
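The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea it describes (an ensemble of regression models, an aggregated y estimate, and repeated outlier flagging), not the author's method. Everything specific here is an assumption: the hypothetical function name `ensemble_outlier_detection`, bootstrap resampling, ordinary least-squares sub-models, median aggregation, and a cutoff of three robust standard deviations based on the median absolute deviation.

```python
import numpy as np

def ensemble_outlier_detection(X, y, n_models=100, n_rounds=10, threshold=3.0, seed=0):
    """Illustrative only: flag outliers with a bootstrap ensemble of linear models."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xb = np.column_stack([np.ones(n), X])  # prepend an intercept column
    is_outlier = np.zeros(n, dtype=bool)
    for _ in range(n_rounds):
        inliers = np.flatnonzero(~is_outlier)
        preds = np.empty((n_models, n))
        for m in range(n_models):
            # Fit each sub-model on a bootstrap sample of the current inliers.
            idx = rng.choice(inliers, size=inliers.size, replace=True)
            coef, *_ = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)
            preds[m] = Xb @ coef
        y_hat = np.median(preds, axis=0)  # ensemble estimate for every sample
        resid = y - y_hat
        center = np.median(resid)
        # Robust scale from the median absolute deviation (assumed criterion).
        scale = 1.4826 * np.median(np.abs(resid - center))
        # Every sample is re-evaluated each round, so earlier flags can be undone.
        new_flags = np.abs(resid - center) > threshold * scale
        if np.array_equal(new_flags, is_outlier):
            break  # the flagged set has stabilized
        is_outlier = new_flags
    return is_outlier, y_hat

# Synthetic demonstration data (not from the paper).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
y[:5] += 10.0  # inject five gross outliers
flags, y_hat = ensemble_outlier_detection(X, y)
print("flagged samples:", np.flatnonzero(flags))
```

In this sketch the sub-models are refit only on samples currently treated as inliers, while residuals are recomputed for all samples, which is one plausible reading of detecting outliers "by repeated calculations."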
Related Topics
Physical Sciences and Engineering
Chemistry
Analytical Chemistry
Authors
Hiromasa Kaneko