Article ID Journal Published Year Pages File Type
394373 Information Sciences 2013 14 Pages PDF
Abstract

Generalized itemset mining is an established data mining technique that focuses on discovering high-level correlations among large databases. By exploiting a taxonomy built over the data items, items are aggregated into higher level concepts and, thus, data correlations at different abstraction levels can be discovered. However, since a large number of patterns can be extracted, the result of the mining process is often not easily manageable by domain experts.We propose a novel approach to discovering a compact subset of generalized itemsets from structured data. To guarantee model conciseness and readability, a set of itemsets that has a common generalization is generated only when its cardinality is so small that its manual inspection is practically feasible. Furthermore, generalizations are generated only when their knowledge is covered by a large number of low-level descendant itemsets, and the generalizations are worth considering in place of their many low-level descendants only in these cases.Experiments performed on synthetic, benchmark, and real data taken from a mobile application scenario demonstrate the effectiveness and efficiency of the proposed approach.

Keywords
Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, ,