Article ID: 6858183
Journal: Information Sciences
Published Year: 2014
Pages: 14
File Type: PDF
Abstract
This paper proposes a new method for trend analysis of categorical data streams. A data stream is partitioned into a sequence of time windows, and the records in each window are assumed to carry a number of concepts represented as clusters. A data labeling algorithm is proposed to identify the concepts, or clusters, of a window from the concepts of the preceding window. A representation of a concept is presented, and the distance between two concepts in consecutive windows is defined in order to analyze how concepts change from one window to the next. Finally, a trend analysis algorithm is proposed to compute the trend of concept change in a data stream over the sequence of consecutive time windows. Methods for measuring the significance of an attribute that causes the concept change, and the outlier degrees of objects, are presented to reveal the causes of concept change. Experiments on real data sets are presented to demonstrate the benefits of the trend analysis method.
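The window-by-window comparison described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's concept representation or distance, so everything here is an assumption made for illustration: each window is treated as a single concept (the paper uses several clusters per window), a concept is summarized by per-attribute value frequencies, and the distance is the mean total variation distance across attributes. The function names `partition_windows`, `concept_summary`, `concept_distance`, and `trend` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

Record = Sequence[str]            # one categorical record, one value per attribute
Summary = List[Dict[str, float]]  # per-attribute relative value frequencies


def partition_windows(stream: List[Record], size: int) -> List[List[Record]]:
    """Split the stream into consecutive windows of at most `size` records."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def concept_summary(records: List[Record]) -> Summary:
    """Assumed representation: relative frequency of each value, per attribute."""
    n = len(records)
    summary: Summary = []
    for j in range(len(records[0])):
        counts = Counter(r[j] for r in records)
        summary.append({value: c / n for value, c in counts.items()})
    return summary


def concept_distance(a: Summary, b: Summary) -> float:
    """Assumed distance: mean total variation distance over attributes, in [0, 1]."""
    total = 0.0
    for pa, pb in zip(a, b):
        values = set(pa) | set(pb)
        total += 0.5 * sum(abs(pa.get(v, 0.0) - pb.get(v, 0.0)) for v in values)
    return total / len(a)


def trend(stream: List[Record], size: int) -> List[float]:
    """Distance between the summaries of each pair of consecutive windows."""
    summaries = [concept_summary(w) for w in partition_windows(stream, size)]
    return [concept_distance(s, t) for s, t in zip(summaries, summaries[1:])]
```

Under these assumptions, a sequence of small distances suggests a stable concept, while a spike marks a window where the value distribution shifted. The per-attribute terms inside `concept_distance` also hint at which attribute contributes most to a change, which is loosely related to the attribute significance mentioned in the abstract.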
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence