Article code | Journal code | Publication year | English article | Full-text version
391725 | 661934 | 2016 | 26 pages, PDF | Free download
English title of the ISI article
Mining interesting patterns from uncertain databases
Keywords
Data mining, frequent patterns, uncertain databases, correlated patterns, weighted patterns
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• We propose a strategy for weighted uncertain interesting pattern mining.
• It computes expected support confidence to mine weighted correlated patterns.
• It prunes infrequent patterns early by computing prefix proxy values.
• It constructs a tree to capture prefix cap and proxy values for uncertain databases.
• Our strategy generates a manageable number of interesting patterns quickly.

Growing demand for efficient algorithms that mine frequent itemsets from uncertain databases has produced several approaches in recent years, but all of them rely on support-based constraints to prune the combinatorial search space. Most real-life databases contain data whose correctness is uncertain. A support-based constraint alone is not enough, because the frequent itemsets may have weak affinity. Even a very high minimum support is not effective for finding correlated patterns with increased weight or support affinity. A few approaches for precise databases propose new measures to mine correlated patterns, but they are not applicable to uncertain databases because certain and uncertain databases differ both semantically and computationally. In this paper, we propose a new strategy, Weighted Uncertain Interesting Pattern Mining (WUIPM), in which a tree structure (WUIP-tree) and several new measures (e.g., uConf, wUConf) are suggested to mine correlated patterns from uncertain databases. To our knowledge, ours is the first work to consider the weight or importance of an individual item alongside the correlation between items of patterns in uncertain databases. Additionally, we propose a new metric for our WUIP-tree, the prefix proxy value (pProxy), that helps improve mining performance. A comprehensive performance study shows that our strategy (a) generates fewer but valuable patterns and (b) is faster than existing approaches even when affinity measures are not applied.
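
The abstract names the measures but does not define them, so the following is only a minimal sketch of the kind of computation involved. The expected support of a pattern is computed in the usual way for uncertain databases, as the sum over transactions of the product of its items' existence probabilities. The `u_conf` ratio (an all-confidence-style analogue) and the average-weight scaling in `weighted_expected_support` are assumptions made for illustration and may differ from the paper's uConf and wUConf; the toy database and item weights are invented.

```python
from math import prod

# Toy uncertain database: each transaction maps an item to its existence
# probability. All values are made up for illustration.
DB = [
    {"a": 0.9, "b": 0.8, "c": 0.2},
    {"a": 0.6, "b": 0.5},
    {"b": 0.7, "c": 0.9},
]

# Hypothetical item weights (importance of each item).
WEIGHTS = {"a": 0.5, "b": 1.0, "c": 0.8}


def expected_support(pattern, db):
    """Sum, over transactions containing every item of the pattern,
    of the product of those items' existence probabilities."""
    total = 0.0
    for transaction in db:
        if all(item in transaction for item in pattern):
            total += prod(transaction[item] for item in pattern)
    return total


def u_conf(pattern, db):
    """ASSUMED affinity measure: expected support of the pattern divided by
    the largest expected support among its single items."""
    denom = max(expected_support([item], db) for item in pattern)
    return expected_support(pattern, db) / denom if denom else 0.0


def weighted_expected_support(pattern, db, weights):
    """ASSUMED weighting: average item weight times expected support."""
    avg_weight = sum(weights[item] for item in pattern) / len(pattern)
    return avg_weight * expected_support(pattern, db)


if __name__ == "__main__":
    pattern = ["a", "b"]
    print(expected_support(pattern, DB))
    print(u_conf(pattern, DB))
    print(weighted_expected_support(pattern, DB, WEIGHTS))
```

Under these assumptions, a pattern can have a respectable expected support and still score low on the affinity ratio when one of its items is far more frequent than the pattern itself, which is the situation the abstract describes as weak affinity.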

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 354, 1 August 2016, Pages 60–85