Article ID: 496930
Journal: Applied Soft Computing
Published Year: 2011
Pages: 13
File Type: PDF
Abstract

The minimum-frequency constraint in classical association mining algorithms is a bottleneck in discovering the large number of infrequent associations that may be rich in knowledge content. Lowering the frequency threshold not only incurs the high cost of pattern explosion but also reduces the reliability of the discovered knowledge. The goal of this paper is to devise a new framework that addresses two needs. The first is the discovery of confident associations without the classical minimum-frequency constraint. The second is to ensure the quality of the discovered rules. We propose a new property among items, termed cohesion, and develop scalable cohesion-based algorithms for confident association discovery. To assess the quality of rules in terms of knowledge content, we propose two new measures, accuracy and predictability, based on documented associations. Experiments with market-basket data as well as microarray data establish the superiority of the cohesion-based technique in terms of both the amount and the quality of discovered knowledge.
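
To make the bottleneck concrete, the following is a minimal sketch of the classical support/confidence filter that the abstract argues against. It does not implement the paper's cohesion property, which the abstract does not define. The toy transactions, the item names, and the helper functions `support`, `confidence`, and `rules` are all hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Hypothetical toy transactions (assumption: not taken from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"caviar", "champagne"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    return support(antecedent | consequent) / support(antecedent)


def rules(min_support, min_confidence):
    """Single-item rules a -> b that pass both classical thresholds."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in permutations(items, 2):
        # The support test runs first, so rare pairs are dropped
        # before their confidence is ever considered.
        if support({a, b}) >= min_support and confidence({a}, {b}) >= min_confidence:
            found.append((a, b))
    return found


# With a 20% minimum support, only the frequent bread/milk rules survive.
print(rules(min_support=0.2, min_confidence=0.8))
# Lowering the threshold to 10% also admits the rare caviar/champagne rules.
print(rules(min_support=0.1, min_confidence=0.8))
```

In this toy data the rule caviar -> champagne holds in every transaction that contains caviar, yet it is discarded at the higher support threshold because the pair appears in only one of ten transactions. On realistic data, lowering the threshold far enough to keep such rules admits many more candidate itemsets, which is the pattern explosion the abstract refers to.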

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications