Article code, Journal code, Publication year, English article, Full-text version
535142, 870324, 2008, 10 pages (PDF), Free download
English title of the ISI article
Empirical likelihood confidence intervals for differences between two datasets with missing data
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract

Detecting differences between populations (or datasets) is an important research topic in machine learning, and a common application is evaluation, for example assessing a new medical product by comparing it with an old one. Previous research has focused on change detection. In this paper, we measure the uncertainty of structural differences between populations, such as differences in means and in distribution functions, using confidence intervals (CIs) obtained via an empirical likelihood approach. We present a statistically sound method for estimating CIs for differences between non-parametric populations with missing values, which are imputed using simple random hot deck imputation. We illustrate the power of CI estimation as a new machine learning technique for tasks such as distinguishing spam from non-spam emails in the Spambase dataset from the UCI repository.
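
To make the workflow in the abstract concrete, here is a minimal Python sketch, not the paper's method. It imputes missing values by simple random hot deck imputation (each missing entry is replaced by a randomly drawn observed value from the same sample) and then builds a CI for the difference in means. The interval uses a percentile bootstrap as a simpler stand-in for the empirical likelihood construction, which the abstract does not spell out. The function names, the use of `None` to mark missing values, and the toy data are all assumptions made for illustration.

```python
import random
import statistics


def hot_deck_impute(values, rng):
    """Replace each None with a donor drawn at random (with replacement)
    from the observed entries of the same sample."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to draw donors from")
    return [v if v is not None else rng.choice(observed) for v in values]


def bootstrap_mean_diff_ci(x, y, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap CI for mean(x) - mean(y).

    Assumption: this bootstrap replaces the empirical likelihood step.
    Each replicate resamples the incomplete data and re-imputes, so the
    extra variability introduced by imputation is reflected in the CI.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        xb = [rng.choice(x) for _ in x]
        yb = [rng.choice(y) for _ in y]
        # Skip the rare replicate that contains no observed value at all.
        if all(v is None for v in xb) or all(v is None for v in yb):
            continue
        xb = hot_deck_impute(xb, rng)
        yb = hot_deck_impute(yb, rng)
        diffs.append(statistics.fmean(xb) - statistics.fmean(yb))
    if not diffs:
        raise ValueError("no usable bootstrap replicates")
    diffs.sort()
    alpha = 1.0 - level
    lower = diffs[int((alpha / 2) * (len(diffs) - 1))]
    upper = diffs[int((1 - alpha / 2) * (len(diffs) - 1))]
    return lower, upper


# Hypothetical toy samples; None marks a missing value.
x = [5.1, None, 4.8, 5.6, None, 5.0, 5.3, 4.9]
y = [3.9, 4.2, None, 4.0, 4.4, 3.8, None, 4.1]
lower, upper = bootstrap_mean_diff_ci(x, y)
print(f"95% CI for the mean difference: [{lower:.3f}, {upper:.3f}]")
```

If the resulting interval excludes zero, the data suggest that the two population means differ; the width of the interval indicates how uncertain that difference is.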

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 29, Issue 6, 15 April 2008, Pages 803–812
Authors
, ,