Article ID Journal Published Year Pages File Type
404114 Knowledge-Based Systems 2008 7 Pages PDF
Abstract

Frequent itemset discovery in data streams is of great importance in different applications. However, the setting of a minimum support threshold is typically hard sometimes. It is interesting for users to find the most K significant n-itemsets (1 ⩽ n ⩽ N) in sliding windows over a data stream. In this paper, algorithms TOPSIS are presented to address this problem. A sliding window is divided into several sub-windows which are called basic windows serving as updating units. Two new data structures TOPS-Tree and RELIS are devised to maintain potential significant itemsets and mined results, respectively. Moreover, three optimal strategies are exploited to reduce time and space consumption of the algorithms: (1) pruning impotent nodes in the newest basic window whenever a sliding window updates, (2) promoting support threshold during mining process dynamically, and (3) adjusting potency parameter adaptively. The precision of the algorithm are also analyzed. Experimental studies are performed to evaluate the good effectiveness and high efficiency of the algorithm.

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, , ,