Article ID Journal Published Year Pages File Type
1153318 Statistics & Probability Letters 2007 7 Pages PDF
Abstract
Many data mining algorithms can only deal with discrete data or have a better performance on discrete data; however, for some technological reasons often we can only obtain the continuous value in the real world. Therefore, discretization has played an important role in data mining. Discretization is defined as the process of mapping the continuous attribute space into the discrete space, namely, using integer values or symbols to represent the continuous spaces. In this paper, we proposed a discretization method on the basis of a Univariate Marginal Distribution Algorithm (UMDA). The UMDA is a combination of statistics learning theory and Evolution Algorithms. The fitness function of the UMDA not only took the accuracy of the classifier into account, but also the number of breakpoints. Experimental results showed that the algorithm proposed in this paper could effectively reduce the number of breakpoints, and at the same time, improve the accuracy of the classifier.
Related Topics
Physical Sciences and Engineering Mathematics Statistics and Probability
Authors
, , , ,