Article ID | Journal | Published Year | Pages |
---|---|---|---|
1249200 | TrAC Trends in Analytical Chemistry | 2012 | 10 Pages |
Data from high-throughput analytical instruments are growing in both volume and complexity, which poses a number of challenges for statistical modeling. To understand such complex data, new statistically efficient approaches are urgently needed to: (1) select salient features from the data; (2) discard uninformative data; (3) detect outlying samples; (4) visualize existing patterns in the data; (5) improve prediction accuracy; and, finally, (6) feed back to the analyst understandable summaries of the information in the data. We review current developments in tree-based ensemble methods for effectively mining the knowledge hidden in chemical and biological data. We report on applications of these algorithms to variable selection, outlier detection, supervised pattern analysis, cluster analysis, and tree-based kernel and ensemble learning. Through this report, we wish to inspire chemists to take greater interest in decision trees and to obtain greater benefits from tree-based ensemble techniques.
► Decision trees can perform automatic stepwise variable selection and complexity reduction. ► Decision trees can cope effectively with complex chemical data. ► Tree-based ensemble approaches can be applied to solve various chemometric problems.
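
As a rough illustration of the variable-selection idea named in the highlights, the sketch below bags depth-one decision trees (stumps) on bootstrap samples and credits each variable with the Gini impurity decrease of the splits it wins. This is a deliberately simplified stand-in, not the method of the reviewed article: the function names (`gini`, `best_stump`, `stump_ensemble_importance`), the synthetic data, and all parameter values are assumptions made for illustration only.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_stump(X, y):
    """Return (feature index, threshold, impurity decrease) of the best single split.

    Returns (None, None, 0.0) if no split reduces the impurity.
    """
    n = len(y)
    parent = gini(y)
    best = (None, None, 0.0)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            children = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - children
            if gain > best[2]:
                best = (j, t, gain)
    return best


def stump_ensemble_importance(X, y, n_trees=50, seed=0):
    """Normalized variable importance from stumps fitted to bootstrap samples."""
    rng = random.Random(seed)
    n = len(y)
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        # Bootstrap: draw n samples with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        j, _t, gain = best_stump(Xb, yb)
        if j is not None:
            importance[j] += gain
    total = sum(importance)
    return [v / total for v in importance] if total > 0 else importance


# Hypothetical data: only variable 0 carries class information,
# variables 1 and 2 are pure noise.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(60)]
y = [1 if row[0] > 0.5 else 0 for row in X]

imp = stump_ensemble_importance(X, y)
print(imp)
```

On this toy data the informative variable should receive the largest importance, which is the basis for ranking and then discarding uninformative variables. Practical tree ensembles grow deeper trees and usually subsample variables at each split, so this sketch only conveys the principle.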