PrePost+: An efficient N-lists-based algorithm for mining frequent itemsets via Children

Article ID	Journal	Published Year	Pages	File Type
382408	Expert Systems with Applications	2015	9 Pages	PDF

Abstract

• In this paper, we propose an algorithm named PrePost+ for frequent itemset mining.
• PrePost+ employs N-lists to represent itemsets and Children–Parent Equivalence pruning to narrow the search space.
• Experimental results on real datasets show that PrePost+ is effective and outperforms state-of-the-art algorithms.

The N-list is a novel data structure proposed in recent years that has proven very efficient for mining frequent itemsets. In this paper, we present PrePost+, a high-performance algorithm for mining frequent itemsets. It employs N-lists to represent itemsets and discovers frequent itemsets directly using a set-enumeration search tree. In particular, it employs an efficient pruning strategy named Children–Parent Equivalence pruning to greatly reduce the search space. We have conducted extensive experiments to evaluate PrePost+ against three state-of-the-art algorithms, PrePost, FIN, and FP-growth∗, on six real datasets. The experimental results show that PrePost+ is the fastest on all six datasets. Moreover, PrePost+ also performs well in terms of memory consumption: it uses only a little more memory than FP-growth∗ and less memory than PrePost and FIN.
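The abstract does not spell out the pruning rule, so the following is only a minimal sketch of the general idea, under two stated assumptions. First, it assumes the rule is the usual support-equivalence one: when extending an itemset with an item leaves the support unchanged, that item is not expanded as a child node but is absorbed and later combined freely with every itemset found in the subtree. Second, it replaces N-lists with plain sets of transaction ids, which is a much simpler structure than the one the paper uses. The names `mine` and `expand` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def mine(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    Illustrative only: uses transaction-id sets as a stand-in for N-lists.
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    # Frequent single items, ordered by ascending support (ties by name).
    frequent = sorted(
        (i for i in tidsets if len(tidsets[i]) >= min_support),
        key=lambda i: (len(tidsets[i]), i),
    )
    result = {}

    def expand(prefix, prefix_tids, candidates, inherited_equiv):
        equiv = list(inherited_equiv)  # items that never change the support
        children = []
        for item in candidates:
            tids = prefix_tids & tidsets[item]
            if len(tids) == len(prefix_tids):
                # Assumed equivalence rule: child support equals parent
                # support, so absorb the item instead of creating a node.
                equiv.append(item)
            elif len(tids) >= min_support:
                children.append((item, tids))

        # Emit the prefix together with every subset of the absorbed items;
        # all of these have the same support as the prefix.
        for r in range(len(equiv) + 1):
            for extra in combinations(equiv, r):
                itemset = frozenset(prefix + extra)
                if itemset:  # skip the empty itemset at the root
                    result[itemset] = len(prefix_tids)

        # Set-enumeration step: each child may only be extended by the
        # children that come after it in the fixed item order.
        for k, (item, tids) in enumerate(children):
            later = [c for c, _ in children[k + 1:]]
            expand(prefix + (item,), tids, later, equiv)

    expand((), set(range(len(transactions))), frequent, [])
    return result


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
result = mine(transactions, min_support=2)
```

In this toy input, item `a` occurs in every transaction, so it is absorbed at the root and never becomes a search-tree node; the tree only branches on `b` and `c`, yet every itemset containing `a` is still reported with its correct support.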

Keywords

Frequent itemset mining; Algorithm; Data mining; Pruning