Article ID: 6861623
Journal: Knowledge-Based Systems
Published Year: 2018
Pages: 28
File Type: PDF
Abstract
High utility itemset (HUI) mining is an important problem in the data mining literature: it takes the utilities of items (such as profits and margins) into account to discover interesting patterns in transactional databases. Several data structures, pruning strategies and algorithms have been proposed to mine high utility itemsets efficiently. Most of these works, however, do not consider items with negative unit profits, even though supporting them gives a decision maker greater flexibility in determining profitable itemsets. This paper aims to advance the state of the art and presents a generalized high utility mining (GHUM) method that considers both positive and negative unit profits. The proposed method uses a simplified utility-list data structure to store itemset information during the mining process. The paper also introduces a novel utility-based anti-monotonic property to improve the performance of HUI mining. Furthermore, GHUM adapts key pruning strategies from the basic HUI mining literature and presents new pruning strategies that significantly improve mining performance. The proposed method is evaluated on a set of benchmark sparse and dense datasets and compared against a state-of-the-art method. A rigorous experimental evaluation is performed, and the implications of the key findings are presented. In general, GHUM was found to deliver more than an order of magnitude improvement over the state-of-the-art FHN method while using a fraction of the memory.
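To make the problem statement in the abstract concrete, the following is a minimal sketch of high utility itemset mining with both positive and negative unit profits. It is a brute-force enumeration written only to illustrate the definition of itemset utility. It is not the GHUM method, and it uses neither the utility-list structure nor any of the pruning strategies the paper describes. The items, quantities, unit profits and the names `unit_profit`, `transactions`, `total_utility` and `high_utility_itemsets` are all hypothetical.

```python
from itertools import combinations

# Hypothetical unit profits; "bag" has a negative unit profit
# (for example, an item sold at a loss). Illustrative data only.
unit_profit = {"bread": 2, "milk": 3, "bag": -1}

# Each transaction maps an item to its purchased quantity.
transactions = [
    {"bread": 2, "milk": 1, "bag": 1},
    {"bread": 1, "bag": 2},
    {"milk": 2, "bag": 1},
]


def utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(unit_profit[item] * transaction[item] for item in itemset)


def total_utility(itemset):
    """Utility of an itemset summed over the whole database."""
    return sum(utility(itemset, t) for t in transactions)


def high_utility_itemsets(min_util):
    """Naively enumerate every itemset and keep those with utility >= min_util."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = total_utility(itemset)
            if u >= min_util:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(7).items():
        print(itemset, u)
```

On this toy data with a threshold of 7, `("bread",)` has utility 6 and is not a high utility itemset, yet its superset `("bread", "milk")` has utility 7 and is. Conversely, adding the negative-profit item lowers the utility of `("milk",)` from 9 to 7. Utility is therefore neither monotone nor anti-monotone under set inclusion, which is why exhaustive enumeration like this does not scale and dedicated pruning properties are needed.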
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence