Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
386811 | 660891 | 2014 | 11-page PDF | Free download |
• A novel approach to discovering valuable knowledge from large datasets is proposed.
• We compare data correlations at different abstraction levels.
• An algorithm to extract misleading high-level data correlations is presented.
• Itemsets representing contrasting data correlations are discovered.
• The effectiveness of the proposed approach has been evaluated on real mobile data.
Frequent generalized itemset mining is a data mining technique used to discover a high-level view of interesting knowledge hidden in the analyzed data. By exploiting a taxonomy, patterns can be extracted at any level of abstraction. However, some misleading high-level patterns could be included in the mined set. This paper proposes a novel generalized itemset type, namely the Misleading Generalized Itemset (MGI). Each MGI, denoted as X ▷ E, represents a frequent generalized itemset X and its set E of low-level frequent descendants for which the correlation type is in contrast to that of X. To allow experts to analyze the misleading high-level data correlations separately and to exploit such knowledge by making different decisions, MGIs are extracted only if the low-level descendant itemsets that represent contrasting correlations cover almost the same portion of data as the high-level (misleading) ancestor. An algorithm to mine MGIs on top of traditional generalized itemsets is also proposed. Experiments performed on both real and synthetic datasets demonstrate the effectiveness and efficiency of the proposed approach.
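The MGI definition above can be illustrated with a minimal sketch. The abstract does not name the correlation measure or say how "almost the same portion of data" is quantified, so everything concrete here is an assumption made for illustration: the Kulczynski measure with the thresholds `max_neg` and `min_pos`, the uncovered-fraction check `max_nod`, the toy two-level `TAXONOMY`, and all function names. It should not be read as the paper's algorithm.

```python
from itertools import product

# Hypothetical two-level taxonomy: leaf item -> generalized (high-level) item.
TAXONOMY = {
    "Turin": "Italy", "Rome": "Italy",
    "Photo": "Media", "Video": "Media",
}

def matches(item, transaction):
    """True if the transaction contains the item or one of its leaf descendants."""
    return item in transaction or any(TAXONOMY.get(leaf) == item for leaf in transaction)

def support(itemset, transactions):
    """Fraction of transactions that match every item of the itemset."""
    hits = sum(all(matches(i, t) for i in itemset) for t in transactions)
    return hits / len(transactions)

def kulc(itemset, transactions):
    """Kulczynski measure (an assumption): mean of sup(X) / sup(item) over the items."""
    sup_x = support(itemset, transactions)
    return sum(sup_x / support((i,), transactions) for i in itemset) / len(itemset)

def correlation_type(itemset, transactions, max_neg=0.4, min_pos=0.6):
    """Classify a frequent itemset as positively, negatively, or not correlated."""
    k = kulc(itemset, transactions)
    if k >= min_pos:
        return "positive"
    if k <= max_neg:
        return "negative"
    return "none"

def descendants(itemset):
    """Leaf-level itemsets obtained by specializing every high-level item of X."""
    children = [[leaf for leaf, parent in TAXONOMY.items() if parent == i] for i in itemset]
    return list(product(*children))

def mine_mgi(x, transactions, min_sup, max_nod):
    """Return (X, E, uncovered_fraction) if X qualifies as an MGI, else None.

    min_sup must be greater than zero so that every tested item has non-zero support.
    """
    sup_x = support(x, transactions)
    if sup_x < min_sup:
        return None
    x_type = correlation_type(x, transactions)
    if x_type == "none":
        return None
    # E: frequent descendants whose correlation type contrasts with that of X.
    e = [d for d in descendants(x)
         if support(d, transactions) >= min_sup
         and correlation_type(d, transactions) not in ("none", x_type)]
    if not e:
        return None
    # Fraction of X's transactions NOT covered by any contrasting descendant.
    covered = sum(any(all(matches(i, t) for i in d) for d in e)
                  for t in transactions) / len(transactions)
    nod = (sup_x - covered) / sup_x
    return (x, e, nod) if nod <= max_nod else None

# Toy data: (Italy, Media) looks negatively correlated at the high level,
# while its descendant (Turin, Photo) is positively correlated.
transactions = [{"Turin", "Photo"}] * 3 + [{"Rome"}] * 5 + [{"Video"}] * 5
result = mine_mgi(("Italy", "Media"), transactions, min_sup=0.2, max_nod=0.1)
print(result)
```

In the toy data, the high-level itemset is flagged because its only frequent descendant has the opposite correlation type and accounts for all of the ancestor's transactions. A real miner would enumerate candidates from the output of a generalized itemset mining step instead of testing a single hand-picked itemset.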
Journal: Expert Systems with Applications - Volume 41, Issue 4, Part 1, March 2014, Pages 1400–1410