Article code | Journal code | Publication year | English article | Full-text version
1134987 | 956084 | 2012 | 12-page PDF | Free download
English title of the ISI article
Towards a variable size sliding window model for frequent itemset mining over data streams
Related topics
Engineering and Basic Sciences > Other Engineering Disciplines > Industrial and Manufacturing Engineering
English abstract

The sliding window is a widely used model for data stream mining because of its emphasis on recent data and its bounded memory requirement. The main idea behind a transactional sliding window is to keep a fixed-size window over a data stream. The window size is kept constant by removing old transactions from the window when new transactions arrive. Older transactions in the window are removed irrespective of whether a significant change has occurred. Another challenge of the sliding window model is determining the window size. The classic approach is to obtain it from the user. To determine the precise size of the window, the user must have prior knowledge about the time and scale of changes within the data stream. However, because data streams change unpredictably, this prior knowledge cannot be easily obtained. Moreover, when a fixed window size is used throughout data stream mining, the model's ability to reflect recent changes degrades. Based on these observations, this study relaxes the notion of window size and proposes a new algorithm named VSW (Variable Size sliding Window frequent itemset mining), which is suitable for observing recent changes in the set of frequent itemsets over data streams. The window size is determined dynamically based on the amount of concept change that occurs within the arriving data stream. The window expands as the concept becomes stable and shrinks when a concept change occurs. In this study, it is shown that if stale transactions are removed from the window after a concept change, the updated frequent itemsets always belong to the most recent concept. Experimental evaluations on both synthetic and real data show that our algorithm effectively detects the concept change, adjusts the window size, and adapts itself to the new concepts along the data stream.


► A novel method for both concept change detection and mining.
► Introducing two theorems with proofs about concept change detection.
► Introducing two new definitions about concept change of frequent itemsets.
► A novel window model for data stream mining in which the size of window is not fixed.
► Experimental evaluations on both real and synthetic datasets to show the effectiveness of the proposed model.
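
The expand-and-shrink behaviour described in the abstract can be illustrated with a small Python sketch. This is not the VSW algorithm from the paper: the abstract does not give its change measure, theorems, or data structures. Everything here is an assumption made for illustration: the class `VariableWindow`, the brute-force `frequent_itemsets` helper, the Jaccard-distance `change_amount` measure, the batch-wise checking, and the parameters `batch_size` and `change_threshold`.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets (frozensets) with relative support >= min_support.

    Brute-force counting of itemsets up to max_size items; toy use only.
    """
    n = len(transactions)
    if n == 0:
        return set()
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c / n >= min_support}


def change_amount(old, new):
    """Hypothetical change measure: Jaccard distance between two sets
    of frequent itemsets (0.0 = identical, 1.0 = disjoint)."""
    union = old | new
    if not union:
        return 0.0
    return 1.0 - len(old & new) / len(union)


class VariableWindow:
    """Illustrative variable-size window (assumed design, not the paper's)."""

    def __init__(self, batch_size, min_support, change_threshold):
        self.batch_size = batch_size
        self.min_support = min_support
        self.change_threshold = change_threshold
        self.window = []   # transactions currently inside the window
        self.pending = []  # newest transactions, not yet checked

    def add(self, transaction):
        self.pending.append(transaction)
        if len(self.pending) < self.batch_size:
            return
        batch, self.pending = self.pending, []
        if self.window:
            old = frequent_itemsets(self.window, self.min_support)
            new = frequent_itemsets(batch, self.min_support)
            if change_amount(old, new) > self.change_threshold:
                # Concept change: drop stale transactions, so the window shrinks.
                self.window = []
        # Stable concept (or fresh start): the window grows by one batch.
        self.window.extend(batch)

    def frequent(self):
        return frequent_itemsets(self.window, self.min_support)
```

In this sketch the window keeps growing while successive batches yield similar frequent itemsets, and it is reset to the newest batch once the assumed distance exceeds `change_threshold`, so the itemsets reported afterwards come only from transactions that arrived after the detected change.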

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Industrial Engineering - Volume 63, Issue 1, August 2012, Pages 161–172
Authors
, , ,