دانلود رایگان مقاله: یک رویکرد منفی کاذب به مجموعه های مستهجن معدن از جریان داده های جریان معاملات سریع

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
395764	666012	2015	30 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

A false negative approach to mining frequent itemsets from high speed transactional data streams

ترجمه فارسی عنوان

یک رویکرد منفی کاذب به مجموعه های مستهجن معدن از جریان داده های جریان معاملات سریع

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

جریان داده ها، معدن الگوی مکرر، به حداقل رساندن حافظه

Data stream - جریان داده ها Frequent pattern mining - معماری مکرر

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی

پیش نمایش مقاله

یک رویکرد منفی کاذب به مجموعه های مستهجن معدن از جریان داده های جریان معاملات سریع

چکیده انگلیسی

Mining frequent itemsets from transactional data streams is challenging due to the nature of the exponential explosion of itemsets and the limit memory space required for mining frequent itemsets. Given a domain of I unique items, the possible number of itemsets can be up to 2I − 1. When the length of data streams approaches to a very large number N, the possibility of an itemset to be frequent becomes larger and difficult to track with limited memory. The existing studies on finding frequent items from high speed data streams are false-positive oriented. That is, they control memory consumption in the counting processes by an error parameter ϵ, and allow items with support below the specified minimum support s but above s − ϵ counted as frequent ones. However, such false-positive oriented approaches cannot be effectively applied to frequent itemsets mining for two reasons. First, false-positive items found increase the number of false-positive frequent itemsets exponentially. Second, minimization of the number of false-positive items found, by using a small ϵ, will make memory consumption large. Therefore, such approaches may make the problem computationally intractable with bounded memory consumption. In this paper, we developed algorithms that can effectively mine frequent item(set)s from high speed transactional data streams with a bound of memory consumption. Our algorithms are based on Chernoff bound in which we use a running error parameter to prune item(set)s and use a reliability parameter to control memory. While our algorithms are false-negative oriented, that is, certain frequent itemsets may not appear in the results, the number of false-negative itemsets can be controlled by a predefined parameter so that desired recall rate of frequent itemsets can be guaranteed. Our extensive experimental studies show that the proposed algorithms have high accuracy, require less memory, and consume less CPU time. They significantly outperform the existing false-positive algorithms.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Sciences - Volume 176, Issue 14, 22 July 2006, Pages 1986–2015

نویسندگان

Jeffrey Xu Yu, Zhihong Chong, Hongjun Lu, Zhenjie Zhang, Aoying Zhou,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

دانلود رایگان مقاله ISI : یک رویکرد منفی کاذب به مجموعه های مستهجن معدن از جریان داده های جریان معاملات سریع

دسترسی سریع

ارتباط

English Website