Article ID: 392683
Journal: Information Sciences
Published Year: 2014
Pages: 15
File Type: PDF
Abstract

Decision trees are among the most popular tools for mining data streams. In this paper we propose a new algorithm based on the well-known CART algorithm. The most important task in constructing decision trees for data streams is determining the best attribute on which to split the considered node. To solve this problem we apply the Gaussian approximation. The presented algorithm achieves high classification accuracy with a short processing time. The main result of this paper is a theorem showing that the best attribute computed in the considered node from the available data sample is, with high probability, the same as the attribute that would be derived from the whole data stream.
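
The split decision described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not state the bound, so the threshold used here, epsilon = z(1 - delta) * scale / sqrt(n), is an assumed generic Gaussian-style form, and the names `gini`, `gini_gain`, `should_split`, `delta` and `scale` are all hypothetical. The idea shown is only that a node splits once the Gini gain of the best attribute exceeds that of the runner-up by more than a margin that shrinks as more stream elements arrive.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def gini(labels):
    """Gini impurity of a list of class labels (the CART split measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(rows, labels, attr, threshold):
    """Impurity reduction of the binary split rows[i][attr] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[attr] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[attr] > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def should_split(gains, n, delta=0.05, scale=1.0):
    """Return (best_attr, decided) for a dict of attribute -> Gini gain.

    ASSUMPTION: epsilon = z(1 - delta) * scale / sqrt(n) is an illustrative
    Gaussian-style margin; `scale` stands in for a constant that the
    abstract does not give.
    """
    ranked = sorted(gains, key=gains.get, reverse=True)
    best, second = ranked[0], ranked[1]
    epsilon = NormalDist().inv_cdf(1.0 - delta) * scale / sqrt(n)
    return best, (gains[best] - gains[second]) > epsilon


if __name__ == "__main__":
    # Toy sample: attribute 0 separates the classes, attribute 1 does not.
    rows = [(0, 5), (1, 3), (2, 8), (3, 1), (4, 7), (5, 2), (6, 6), (7, 4)]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    gains = {0: gini_gain(rows, labels, 0, 3), 1: gini_gain(rows, labels, 1, 4)}
    print(should_split(gains, n=len(labels)))
    print(should_split(gains, n=800))
```

Under these assumptions, the same gain difference that is too small to justify a split after 8 elements becomes sufficient once `n` is large, which is the behaviour the abstract's high-probability statement is meant to guarantee.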

Related Topics
Physical Sciences and Engineering; Computer Science; Artificial Intelligence