Article code: 6854760
Journal code: 1437594
Publication year: 2018
English article: 24 pages, PDF
Full-text version: free download
English title of the ISI article
Probabilistic frequent itemset mining over uncertain data streams
Persian translation of the title
معادن اقلام احتمالی مکرر بیش از جریان داده های نامشخص
Keywords
Probabilistic frequent itemsets, uncertain data stream, uncertain database, data stream mining
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
This paper considers the problem of mining probabilistic frequent itemsets in the sliding window of an uncertain data stream. We design an effective in-memory index named PFIT to store the data synopsis, so the current probabilistic frequent itemsets can be output in real time. We also propose a depth-first algorithm, PFIMoS, to bottom-up build and maintain the PFIT dynamically. Because computing the probabilistic support is time consuming, we propose a method to estimate the range of probabilistic support by using the support and expected support, which can greatly reduce the runtime and memory usage. Nevertheless, massive probabilistic supports have to be computed when the minimum support is low over dense data, which may result in a drastic reduction of computing speed. We further address this problem with a heuristic rule-based algorithm, PFIMoS+, in which an error parameter is introduced to decrease the probabilistic support computing count. Theoretical analysis and experimental studies demonstrate that our proposed algorithms can efficiently reduce computing time and memory, ensure fast and exact mining of probabilistic data streams, and markedly outperform the state-of-the-art algorithms TODIS-Stream (Sun et al., 2010) and FEMP (Akbarinia & Masseglia, 2013).
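The quantities named in the abstract (support, expected support, and probabilistic support) can be illustrated with a short sketch. It is not the paper's PFIT index or the PFIMoS algorithm. It assumes one common definition, in which the probabilistic support of an itemset is the largest s such that the itemset appears in at least s transactions of the window with probability at least tau, and it assumes transactions are independent. The function names are hypothetical, and the cheap upper bound uses only the count of possible occurrences and Markov's inequality, which may be looser than the range estimate the authors propose.

```python
from math import floor


def support_pmf(probs):
    """Return pmf where pmf[k] = P(exactly k transactions contain the itemset).

    probs[i] is the probability that transaction i of the window contains
    the itemset; transactions are assumed independent (an assumption of
    this sketch). Runs in O(n^2) time, which is why cheap bounds matter.
    """
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # transaction does not contain it
            nxt[k + 1] += mass * p       # transaction contains it
        pmf = nxt
    return pmf


def probabilistic_support(probs, tau):
    """Largest s with P(support >= s) >= tau (assumed definition)."""
    pmf = support_pmf(probs)
    tail = 0.0
    for s in range(len(pmf) - 1, -1, -1):
        tail += pmf[s]
        if tail >= tau:
            return s
    return 0


def cheap_upper_bound(probs, tau):
    """Upper bound on the probabilistic support without running the DP.

    Uses two generic facts (not the paper's estimate):
      * the support cannot exceed the number of transactions with p > 0;
      * Markov's inequality gives P(support >= s) <= expected_support / s,
        so any s with tail probability >= tau satisfies s <= esup / tau.
    """
    count = sum(1 for p in probs if p > 0)
    expected_support = sum(probs)
    # Small epsilon guards against floating-point undershoot in the division.
    return min(count, floor(expected_support / tau + 1e-9))


window = [0.5, 0.5]
print(probabilistic_support(window, 0.5), cheap_upper_bound(window, 0.5))
```

A miner could skip the quadratic computation whenever the cheap upper bound already falls below the minimum support threshold, which is the general idea behind estimating a range before computing the exact value.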
Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 112, 1 December 2018, Pages 274-287