کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
379208 | 659274 | 2009 | 39 صفحه PDF | دانلود رایگان |
![عکس صفحه اول مقاله: Efficient algorithms for incremental maintenance of closed sequential patterns in large databases Efficient algorithms for incremental maintenance of closed sequential patterns in large databases](/preview/png/379208.png)
Recent study shows that mining compact frequent patterns (such as closed patterns and compressed patterns) can alleviate the interpretability and efficiency problem encountered by traditional frequent pattern mining methods. Compact frequent patterns keep exact or approximate supports of a complete set of frequent patterns, and the number of them is often orders of magnitude smaller. Several efficient algorithms have been proposed to mine compact sequential patterns. However, sequence databases are not always static. Sequences (or items) are often added to and deleted from databases. A slight change made on a database may lead to the change of compact patterns. Mining from scratch is very time-consuming and thus infeasible. In this paper, we explore how to efficiently maintain closed sequential patterns in a dynamic sequence database environment. A compact structure CSTree is designed to keep closed sequential patterns, and its nice properties are carefully studied. Two efficient algorithms, IMCSA and IMCSD, are developed to maintain the CSTree upon incremental update. The algorithms make full use of the properties of CSTree to find nodes whose states are obsolete and avoid unnecessary node extension and closure checking operations to accelerate the incremental update process. A thorough experimental study on various real and synthetic datasets shows that the proposed algorithms outperform the state-of-the-art algorithms – PrefixSpan, CloSpan, BIDE and a recently proposed incremental mining algorithm IncSpan by about a factor of 4 to more than an order of magnitude.
Journal: Data & Knowledge Engineering - Volume 68, Issue 1, January 2009, Pages 68–106