Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
392683 | Information Sciences | 2014 | 15 Pages |
Abstract
Decision trees are among the most popular tools for mining data streams. In this paper we propose a new algorithm based on the well-known CART algorithm. The most important task in constructing decision trees for data streams is determining the best attribute on which to split the considered node. To solve this problem we apply the Gaussian approximation. The presented algorithm achieves high classification accuracy with a short processing time. The main result of this paper is a theorem showing that the best attribute computed in the considered node from the available data sample is, with high probability, the same as the attribute that would be derived from the whole data stream.
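The split decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes numeric attributes, binary splits scored by CART's Gini gain, and a threshold of the form `scale * z / sqrt(n)`, where `z` is the standard normal quantile for confidence `1 - delta`. The function names and the `scale` parameter are hypothetical; `scale` stands in for the constant from the paper's theorem, which the abstract does not state.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(values, labels, threshold):
    """Impurity reduction from the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (gini(labels)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


def best_gain(values, labels):
    """Largest Gini gain over all candidate thresholds of one attribute."""
    candidates = sorted(set(values))[:-1]  # splitting at the max is useless
    return max((gini_gain(values, labels, t) for t in candidates), default=0.0)


def choose_split(sample, labels, delta=0.05, scale=1.0):
    """Return the attribute to split on, or None if the sample is too small.

    sample: dict mapping attribute name -> list of numeric values
            (at least two attributes are assumed).
    scale:  hypothetical placeholder for the constant in the paper's bound.
    """
    gains = sorted(((best_gain(v, labels), name) for name, v in sample.items()),
                   reverse=True)
    (g1, a1), (g2, _) = gains[0], gains[1]
    z = NormalDist().inv_cdf(1.0 - delta)      # Gaussian quantile
    epsilon = scale * z / sqrt(len(labels))    # shrinks as more data arrives
    # Split only when the best attribute beats the runner-up by more than epsilon.
    return a1 if g1 - g2 > epsilon else None
```

In a streaming setting, `choose_split` would be called again as more elements reach the node; a `None` result means the node should keep collecting data before committing to a split.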
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Leszek Rutkowski, Maciej Jaworski, Lena Pietruczuk, Piotr Duda