Article ID: 474861
Journal: Computers & Operations Research
Published Year: 2007
Pages: 15
File Type: PDF
Abstract

The decision tree (DT) induction process has two major phases: the growth phase and the pruning phase. The pruning phase aims to generalize the DT generated in the growth phase by producing a sub-tree that avoids over-fitting to the training data. Most post-pruning methods treat post-pruning as a single-objective problem (i.e., maximizing validation accuracy) and consider simplicity (measured by the number of leaves) only to break ties. However, it is well known that, apart from accuracy, other performance measures (e.g., stability, simplicity, interpretability) are important for evaluating DT quality. In this paper, we propose that multi-objective evaluation be performed during the post-pruning phase to select the best sub-tree, and we propose a procedure for obtaining the optimal sub-tree based on user-provided preference and value function information.
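
The idea of choosing a pruned sub-tree by more than one criterion can be illustrated with a minimal sketch. It is not the procedure from the paper, whose details the abstract does not give. It assumes a simple weighted additive value function over two objectives (validation accuracy and number of leaves), min-max normalisation across the candidate sub-trees, and user-supplied weights. The names `SubTree`, `normalise`, `select_subtree`, and the candidate figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTree:
    """Hypothetical summary of one candidate pruned sub-tree."""
    name: str
    validation_accuracy: float  # fraction in [0, 1]
    n_leaves: int               # fewer leaves = simpler tree


def normalise(values, higher_is_better=True):
    """Min-max scale values to [0, 1] so that 1 is always the preferred end."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates tie on this objective, so it cannot discriminate.
        return [1.0] * len(values)
    if higher_is_better:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def select_subtree(candidates, weights):
    """Return the candidate with the highest weighted additive value.

    Assumption: the user's preferences are expressed as non-negative
    weights for "accuracy" and "simplicity".
    """
    acc = normalise([c.validation_accuracy for c in candidates], True)
    simp = normalise([c.n_leaves for c in candidates], False)
    scores = [
        weights["accuracy"] * a + weights["simplicity"] * s
        for a, s in zip(acc, simp)
    ]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores


# Made-up candidates, for illustration only.
candidates = [
    SubTree("full", validation_accuracy=0.86, n_leaves=25),
    SubTree("mid", validation_accuracy=0.85, n_leaves=9),
    SubTree("stump", validation_accuracy=0.70, n_leaves=2),
]

best, scores = select_subtree(
    candidates, {"accuracy": 0.7, "simplicity": 0.3}
)
```

With these illustrative weights, the slightly less accurate but much smaller `mid` sub-tree is preferred over `full`. An accuracy-only weighting would select `full`, which mirrors the single-objective behaviour the abstract describes.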

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)