Article code: 489856
Journal code: 704634
Publication year: 2015
English article: 7 pages, PDF
Full-text version: free download
English title of the ISI article
Cluster Based Outlier Detection Algorithm for Healthcare Data
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science (General)
English abstract

Outliers have been studied in a variety of domains, including big data, high-dimensional data, uncertain data, time series data, and biological data. In the majority of the sample datasets available in the repository, at least 10% of the data may be erroneous, missing, or not available. In this paper, we utilize the concept of data preprocessing for outlier reduction. We propose two algorithms, namely a distance-based outlier detection algorithm and a cluster-based outlier detection algorithm, for detecting and removing outliers using an outlier score. By cleaning the dataset and clustering based on similarity, we can remove outliers on the key attribute subset rather than on the full-dimensional attributes of the dataset. Experiments were conducted using three built-in healthcare datasets available in an R package, and the results show that the cluster-based outlier detection algorithm provides better accuracy than the distance-based outlier detection algorithm.
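
The abstract does not say how either outlier score is computed. As a minimal sketch, assume the distance-based score is the mean distance to the k nearest neighbours and the cluster-based score is the distance to the nearest k-means centroid. Both definitions are assumptions, as are all function names, parameters, the threshold, and the toy data below. The paper's experiments used R; Python is used here only for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_distance_scores(points, k=2):
    """Assumed distance-based score: mean distance to the k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(euclidean(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation.

    Note: farthest-point seeding can itself pick an extreme outlier as a
    centroid; it is used here only to keep the sketch reproducible.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: euclidean(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def cluster_distance_scores(points, k=2):
    """Assumed cluster-based score: distance to the nearest k-means centroid."""
    centroids = kmeans(points, k)
    return [min(euclidean(p, c) for c in centroids) for p in points]

def flag_outliers(scores, threshold):
    """Indices of points whose score exceeds a user-chosen threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]

# Hypothetical toy data: two tight groups plus one isolated point (index 8).
points = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3),
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.3), (8.1, 8.1),
    (4.5, 4.5),
]

print(flag_outliers(knn_distance_scores(points, k=2), threshold=2.0))
print(flag_outliers(cluster_distance_scores(points, k=2), threshold=2.0))
```

On this toy data both scores single out the isolated point at index 8. To score only a key attribute subset, as the abstract describes, each point could first be reduced to the selected columns before calling either function.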

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 50, 2015, Pages 209-215