Article code: 403452
Journal code: 677231
Publication year: 2016
English article: 18 pages, PDF
Full-text version: free download
English title of the ISI article
Hyper-cylindrical micro-clustering for streaming data with unscheduled data removals
Persian translation of the title
میکرو خوشه بندی بیش از حد استوانه ای برای جریان داده ها با حذف اطلاعات برنامه ریزی نشده
Keywords
Streaming data, density-based clustering, hyper-cylindrical micro-cluster, unscheduled data removal, data clustering
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

We present a streaming data clustering algorithm that allows data records to be removed at any arbitrary time. Unlike existing algorithms, whose objective is to track evolving clusters by letting the weight of old data decline with time, we consider the case in which data records have no pre-specified lifetime, a characteristic of many real data sets such as bank accounts. A data record can be added or removed by users at any time. The algorithm processes each datum in a one-pass, throw-away fashion without storing the whole data set. A technique for merging several micro-clusters into a hyper-cylindrical micro-cluster is proposed to reduce the number of micro-clusters in feature space and thus reduce computation. The algorithm is tested on several synthetic and real data sets. It shows better performance than other state-of-the-art algorithms on several clustering performance indices.
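
To make the one-pass idea with arbitrary removals concrete, here is a minimal Python sketch. It is not the paper's hyper-cylindrical method. It assumes the classical additive cluster-feature summary (point count, per-dimension linear sum, per-dimension squared sum), which lets a record be added or removed without storing the data set. The class name `MicroCluster` and its methods are hypothetical names chosen for illustration.

```python
import math


class MicroCluster:
    """Hypothetical additive summary of one micro-cluster (assumption:
    classical cluster-feature vector, not the paper's hyper-cylinder)."""

    def __init__(self, dim):
        self.n = 0                       # number of records currently summarized
        self.linear_sum = [0.0] * dim    # per-dimension sum of values
        self.squared_sum = [0.0] * dim   # per-dimension sum of squared values

    def add(self, x):
        """Absorb one record; the record itself is not stored."""
        self.n += 1
        for j, v in enumerate(x):
            self.linear_sum[j] += v
            self.squared_sum[j] += v * v

    def remove(self, x):
        """Undo a previous add of record x (an unscheduled removal)."""
        if self.n == 0:
            raise ValueError("cannot remove from an empty micro-cluster")
        self.n -= 1
        for j, v in enumerate(x):
            self.linear_sum[j] -= v
            self.squared_sum[j] -= v * v

    def center(self):
        """Mean of the summarized records."""
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square distance of the records from the center."""
        var = sum(
            ss / self.n - (ls / self.n) ** 2
            for ls, ss in zip(self.linear_sum, self.squared_sum)
        )
        return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding error


# Usage: three records arrive, then the user removes one of them.
mc = MicroCluster(dim=2)
for point in [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]:
    mc.add(point)
mc.remove((4.0, 0.0))
print(mc.n, mc.center(), mc.radius())
```

Because the three statistics are plain sums, a removal is the exact inverse of an addition. The caller must supply the record being removed, since the summary keeps no copy of it.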

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 99, 1 May 2016, Pages 183–200