Article code | Journal code | Publication year | English article | Full text
536554 | 870558 | 2010 | 10 pages, PDF | Free download
English title of the ISI article
Mining outliers with faster cutoff update and space utilization
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract

It is desirable to find unusual data objects using Ramaswamy et al.’s distance-based outlier definition, because it requires only a metric distance function between two objects. Unlike many existing algorithms based on the definition of Knorr and Ng, it does not need a neighborhood distance threshold. Bay and Schwabacher proposed ORCA, an efficient algorithm for this task that can give near-linear time performance. To further reduce the running time, we propose in this paper two algorithms, RC and RS, which use the following two techniques, respectively: (i) faster cutoff update, and (ii) space utilization after pruning. We tested RC, RS, and RCS (a hybrid approach combining both RC and RS) on several large, high-dimensional real data sets with millions of objects. The experiments show that RCS runs 1.4–2.3 times as fast as ORCA, and that this improvement is relatively insensitive to increases in data size.
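
To make the baseline idea concrete, here is a minimal sketch of distance-based top-n outlier mining with a pruning cutoff, in the spirit of the nested-loop approach that ORCA is known for. It assumes the outlier score is the distance to the k-th nearest neighbor and that objects are numeric tuples under Euclidean distance. It is not the RC, RS, or RCS algorithm; the abstract does not give their details. The function name `top_n_outliers` and its parameters are hypothetical, chosen only for illustration.

```python
import heapq
import math
import random


def top_n_outliers(points, k, n, seed=0):
    """Return (score, index) pairs for the n points with the largest
    distance to their k-th nearest neighbor, strongest outlier first.

    Illustrative sketch only: Euclidean distance and in-memory data
    are assumptions, not details taken from the paper.
    """
    # Scan neighbors in random order so that close neighbors tend to
    # show up early and weak candidates are pruned quickly.
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)

    top = []      # min-heap of (score, index); top[0] is the weakest outlier kept
    cutoff = 0.0  # score of the weakest outlier once n outliers are held

    for i in range(len(points)):
        knn = []  # negated distances: a max-heap of the k smallest seen so far
        pruned = False
        for j in order:
            if j == i:
                continue
            d = math.dist(points[i], points[j])
            if len(knn) < k:
                heapq.heappush(knn, -d)
            elif d < -knn[0]:
                heapq.heapreplace(knn, -d)
            # Once k neighbors are known, -knn[0] is an upper bound on the
            # final score; below the cutoff, i cannot enter the top n.
            if len(knn) == k and -knn[0] < cutoff:
                pruned = True
                break
        if pruned or len(knn) < k:
            continue
        score = -knn[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        else:
            heapq.heappushpop(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]  # the cutoff only ever grows
    return sorted(top, reverse=True)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (-8, 7)]
    for score, index in top_n_outliers(data, k=2, n=2):
        print(index, data[index], round(score, 2))
```

The cutoff is what makes the inner loop cheap: the higher it rises, the sooner an ordinary object is discarded, which is why a faster cutoff update is a natural target for reducing running time.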

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 31, Issue 11, 1 August 2010, Pages 1292–1301