Article ID: 416923
Journal: Computational Statistics & Data Analysis
Published Year: 2006
Pages: 20
File Type: PDF
Abstract

Emerging patterns are a class of interaction structures that has recently been proposed as a tool in data mining. A new and more general definition, which refers to the underlying probabilities, is proposed. The resulting interaction patterns (IP) carry information about the relevance of combinations of variables for distinguishing between classes. Because they are formally quite similar to the leaves of a classification tree, a fast and simple method based on the CART algorithm is proposed for finding the corresponding empirical patterns in data sets. Simulations show that the method is quite effective in identifying patterns. In addition, the detected patterns can be used to define new variables for classification, which yields a simple scheme for using the patterns to improve the performance of classification procedures. The method may also be seen as a way to improve the performance of CART with respect to both the identification of IP and the accuracy of prediction.
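
To make the idea concrete, the following minimal Python sketch illustrates what a pattern and a pattern-derived variable might look like. The abstract does not state the paper's probability-based definition, so the sketch assumes the classical emerging-pattern notion instead: a conjunction of conditions whose relative frequency (support) is much higher in one class than in the other. The toy data and the names `support`, `growth_rate`, `add_pattern_feature` and `ip_x1_x2` are hypothetical, and the pattern is written down by hand rather than found by a CART-based search.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Pattern = Callable[[Row], bool]


def support(rows: List[Row], labels: List[int], pattern: Pattern, cls: int) -> float:
    """Fraction of the rows in class `cls` that satisfy `pattern`."""
    in_class = [r for r, y in zip(rows, labels) if y == cls]
    if not in_class:
        return 0.0
    return sum(1 for r in in_class if pattern(r)) / len(in_class)


def growth_rate(rows: List[Row], labels: List[int], pattern: Pattern,
                target: int, other: int) -> float:
    """Ratio of the pattern's support in `target` to its support in `other`."""
    s_target = support(rows, labels, pattern, target)
    s_other = support(rows, labels, pattern, other)
    if s_other == 0.0:
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_other


def add_pattern_feature(rows: List[Row], pattern: Pattern, name: str) -> List[Row]:
    """Return copies of the rows with a 0/1 indicator for the pattern added."""
    return [{**r, name: int(pattern(r))} for r in rows]


# Hypothetical toy data: two binary variables, two classes.
rows = [
    {"x1": 1, "x2": 1}, {"x1": 1, "x2": 1}, {"x1": 1, "x2": 1}, {"x1": 0, "x2": 1},
    {"x1": 1, "x2": 1}, {"x1": 1, "x2": 0}, {"x1": 0, "x2": 1}, {"x1": 0, "x2": 0},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def both_on(row: Row) -> bool:
    """Hand-written candidate pattern: x1 = 1 AND x2 = 1 (an assumption)."""
    return row["x1"] == 1 and row["x2"] == 1


s1 = support(rows, labels, both_on, cls=1)
s0 = support(rows, labels, both_on, cls=0)
rate = growth_rate(rows, labels, both_on, target=1, other=0)
augmented = add_pattern_feature(rows, both_on, "ip_x1_x2")
print(s1, s0, rate)
```

The indicator column `ip_x1_x2` in `augmented` plays the role of a new variable defined from a detected pattern, which could then be passed to any classifier alongside the original variables.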

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics