Article code: 461133
Journal code: 696562
Publication year: 2013
English article: 15-page PDF
Full-text version: Free download
English title of the ISI article
An efficient tree-based algorithm for mining sequential patterns with multiple minimum supports
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications
English abstract

Sequential pattern mining (SPM) is an important technique for determining time-related behavior in sequence databases. In real-life applications, the frequencies of the items in a sequence database are rarely equal. If all items are assigned the same minimum support (minsup), the rare item problem may result: interesting patterns cannot be retrieved effectively whether minsup is set too high or too low. Liu (2006) first introduced the concept of multiple minimum supports (MMSs) into SPM. It allows users to specify a minimum item support (MIS) for each item according to its natural frequency. A generalized sequential pattern-based algorithm, named Multiple Supports – Generalized Sequential Pattern (MS-GSP), was also developed to mine the complete set of sequential patterns. However, MS-GSP adopts a candidate generate-and-test approach, which is recognized as a costly and time-consuming method in pattern discovery. For the efficient mining of sequential patterns with MMSs, this study first proposes a compact data structure, called a Preorder Linked Multiple Supports tree (PLMS-tree), to store and compress the entire sequence database. Based on the PLMS-tree, we develop an efficient algorithm, Multiple Supports – Conditional Pattern growth (MSCP-growth), to discover the complete set of patterns. Experimental results show that the proposed approach yields better results than MS-GSP and conventional SPM.
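
To make the rare item problem and the role of per-item MIS values concrete, here is a minimal Python sketch. It is not the PLMS-tree or MSCP-growth from the paper. It only checks whether a single candidate pattern is frequent, assuming the usual MMS convention that a pattern's threshold is the lowest MIS among its items. The toy database, the item names, the MIS values, and the simplification that every sequence element is a single item rather than an itemset are all assumptions made for illustration.

```python
from typing import Dict, List


def is_subsequence(pattern: List[str], sequence: List[str]) -> bool:
    """Return True if the items of `pattern` occur in `sequence` in order."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern: List[str], database: List[List[str]]) -> float:
    """Fraction of sequences in the database that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)


def pattern_threshold(pattern: List[str], mis: Dict[str, float]) -> float:
    """Assumed MMS rule: the lowest MIS among the pattern's items."""
    return min(mis[item] for item in pattern)


def is_frequent(pattern: List[str], database: List[List[str]],
                mis: Dict[str, float]) -> bool:
    return support(pattern, database) >= pattern_threshold(pattern, mis)


# Hypothetical toy data: "caviar" is rare, "bread" and "milk" are common.
database = [
    ["bread", "milk", "bread"],
    ["bread", "milk"],
    ["milk", "bread"],
    ["bread", "caviar", "milk"],
    ["milk"],
]
mis = {"bread": 0.6, "milk": 0.6, "caviar": 0.2}       # per-item thresholds
uniform = {"bread": 0.6, "milk": 0.6, "caviar": 0.6}   # single minsup of 0.6

for pattern in (["bread", "milk"], ["caviar", "milk"], ["milk", "bread"]):
    print(pattern,
          "support:", support(pattern, database),
          "frequent with MIS:", is_frequent(pattern, database, mis),
          "frequent with uniform minsup:", is_frequent(pattern, database, uniform))
```

With a single minsup of 0.6 the pattern containing the rare item is discarded, while lowering minsup for every item would also admit weak patterns among the common items, such as `["milk", "bread"]` here. Per-item MIS values avoid both outcomes.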


► Multiple minimum supports can prune the search space in sequential pattern mining (SPM).
► An efficient tree-based method is proposed for SPM with multiple minimum supports.
► The algorithm is evaluated on both synthetic and real-life datasets.
► Experimental results show our method is more efficient than traditional methods.

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 86, Issue 5, May 2013, Pages 1224–1238