کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
379325 659290 2006 25 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Online clustering of parallel data streams
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Online clustering of parallel data streams
چکیده انگلیسی

In recent years, the management and processing of so-called data streams has become a topic of active research in several fields of computer science such as, e.g., distributed systems, database systems, and data mining. A data stream can roughly be thought of as a transient, continuously increasing sequence of time-stamped data. In this paper, we consider the problem of clustering parallel streams of real-valued data, that is to say, continuously evolving time series. In other words, we are interested in grouping data streams the evolution over time of which is similar in a specific sense. In order to maintain an up-to-date clustering structure, it is necessary to analyze the incoming data in an online manner, tolerating not more than a constant time delay. For this purpose, we develop an efficient online version of the classical K-means clustering algorithm. Our method’s efficiency is mainly due to a scalable online transformation of the original data which allows for a fast computation of approximate distances between streams.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Data & Knowledge Engineering - Volume 58, Issue 2, August 2006, Pages 180–204
نویسندگان
, ,