کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
404707 677443 2016 17 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
An efficient algorithm for mining the top-k high utility itemsets, using novel threshold raising and pruning strategies
ترجمه فارسی عنوان
یک الگوریتم کارآمد برای استخراج مقادیر بالا ابزار بالا، با استفاده از آستانه جدید و استراتژی های هرس کردن
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی

Top-k high utility itemset mining is the process of discovering the k itemsets having the highest utilities in a transactional database. In recent years, several algorithms have been proposed for this task. However, it remains very expensive both in terms of runtime and memory consumption. The reason is that current algorithms often generate a huge amount of candidate itemsets and are unable to prune the search space effectively. In this paper, we address this issue by proposing a novel algorithm named kHMC to discover the top-k high utility itemsets more efficiently. Unlike several algorithms for top-k high utility itemset mining, kHMC discovers high utility itemsets using a single phase. Furthermore, it employs three strategies named RIU, CUD, and COV to raise its internal minimum utility threshold effectively, and thus reduce the search space. The COV strategy introduces a novel concept of coverage. The concept of coverage can be employed to prune the search space in high utility itemset mining, or to raise the threshold in top-k high utility itemset mining, as proposed in this paper. Furthermore, kHMC relies on a novel co-occurrence pruning technique named EUCPT to avoid performing costly join operations for calculating the utilities of itemsets. Moreover, a novel pruning strategy named TEP is proposed for reducing the search space. To evaluate the performance of the proposed algorithm, extensive experiments have been conducted on six datasets having various characteristics. Results show that the proposed algorithm outperforms the state-of-the-art TKO and REPT algorithms for top-k high utility itemset mining both in terms of memory consumption and runtime.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Knowledge-Based Systems - Volume 104, 15 July 2016, Pages 106–122
نویسندگان
, , , ,