کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
379164 659271 2009 18 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Efficiently tracing clusters over high-dimensional on-line data streams
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Efficiently tracing clusters over high-dimensional on-line data streams
چکیده انگلیسی

A good clustering method should provide flexible scalability on the number of dimensions as well as the size of a data set. This paper proposes a method of efficiently tracing the clusters of a high-dimensional on-line data stream. While tracing the one-dimensional clusters of each dimension independently, a technique which is similar to frequent itemset mining is employed to find the set of multi-dimensional clusters. By finding a frequently co-occurred set of one-dimensional clusters, it is possible to trace a multi-dimensional rectangular space whose range is defined by the one-dimensional clusters collectively. In order to trace such candidates over a multi-dimensional online data stream, a cluster-statistics tree (CS-Tree) is proposed in this paper. A k-depth node(k ⩽ d) in the CS-tree is corresponding to a k-dimensional rectangular space. Each node keeps track of the density of data elements in its corresponding rectangular space. Only a node corresponding to a dense rectangular space is allowed to have a child node. The scalability on the number of dimensions is greatly enhanced while sacrificing the accuracy of identified clusters slightly.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Data & Knowledge Engineering - Volume 68, Issue 3, March 2009, Pages 362–379
نویسندگان
, , ,