کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4946614 1439410 2017 33 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Novel density-based and hierarchical density-based clustering algorithms for uncertain data
ترجمه فارسی عنوان
الگوریتم خوشه بندی مبتنی بر تراکم و سلسله مراتبی رمان برای داده های نامشخص
کلمات کلیدی
خوشه بندی داده های نامعلوم، الگوریتم مبتنی بر تراکم، الگوریتم مبتنی بر تراکم سلسله مراتبی،
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی
Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we firstly propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN from the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, direct reachability probability, thus reducing the complexity and solving the issue of nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulties for clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of the datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing algorithms in accuracy and efficiency.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neural Networks - Volume 93, September 2017, Pages 240-255
نویسندگان
, , ,