Article code: 393417
Journal code: 665650
Publication year: 2013
English article: 11 pages, PDF
English title of the ISI article
Examination and comparison of conflicting data in granulated datasets: Equal width interval vs. equal frequency interval
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

Knowledge discovery from databases requires comprehensive pre-examination to ensure that granulated datasets are consistent for continuous database conversion. Different granulation techniques may produce different results in the number of conflicting data in a granulated dataset. This work examines and compares the performance of equal width interval (EWI) and equal frequency interval (EFI), two granulation techniques. This work also explores the relationship between granulation performance and dataset size, number of attributes, and number of classes. Eighteen continuous datasets are examined. Experimental results indicate that (1) of the 18 datasets examined, 7 contained conflicting data by EWI and 8 by EFI, suggesting that almost 40% of the granulated datasets contained conflicting data; (2) almost 22% of the datasets had more than 20% conflicting data; (3) comparatively, no notable difference existed between EWI and EFI with respect to their granulation performance; (4) the production of conflicting data by EWI and EFI when compared against dataset size and number of classes was not remarkably different; and (5) more than 12 attributes will reduce the number of conflicting data by both EWI and EFI.
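The two granulation techniques named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure. It assumes that a "conflicting" record is one whose granulated attribute values match those of at least one record with a different class label; the abstract does not spell out this definition. The function names, the number of intervals `k`, and the toy data are all hypothetical, and the toy result says nothing about the paper's findings.

```python
from bisect import bisect_right
from collections import defaultdict


def equal_width_cuts(values, k):
    """Return k-1 cut points that split [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so that each interval holds roughly len(values)/k points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def granulate(column, cuts):
    """Map each continuous value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in column]


def count_conflicts(rows, labels):
    """Count records that share granulated values with a record of another class.

    Assumed definition of conflicting data; see the note above.
    """
    classes_by_key = defaultdict(set)
    size_by_key = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row)
        classes_by_key[key].add(label)
        size_by_key[key] += 1
    return sum(size_by_key[key]
               for key, classes in classes_by_key.items()
               if len(classes) > 1)


def conflicts_after(data, labels, cut_fn, k):
    """Granulate every attribute with cut_fn, then count conflicting records."""
    columns = list(zip(*data))
    granulated = [granulate(col, cut_fn(col, k)) for col in columns]
    return count_conflicts(zip(*granulated), labels)


# Hypothetical toy dataset: one continuous attribute with an outlier, two classes.
data = [[1], [2], [3], [4], [5], [100]]
labels = ["a", "a", "a", "b", "b", "b"]

ewi_conflicts = conflicts_after(data, labels, equal_width_cuts, k=2)
efi_conflicts = conflicts_after(data, labels, equal_frequency_cuts, k=2)
print("EWI conflicts:", ewi_conflicts)
print("EFI conflicts:", efi_conflicts)
```

On this toy input the outlier stretches the equal-width intervals so that five records of mixed class land in one interval, while the equal-frequency cut happens to fall between the classes. The abstract reports that, across the 18 real datasets, no notable difference existed between the two techniques.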

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 239, 1 August 2013, Pages 154–164