Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
535142 | 870324 | 2008 | 10-page PDF | Free download |
![First page of the article: Empirical likelihood confidence intervals for differences between two datasets with missing data](/preview/png/535142.png)
Detecting differences between populations (or datasets) is an important research topic in machine learning and a common means of evaluation in applications, for example assessing a new medical product by comparing it with an old one. Previous research has focused on change detection. In this paper, we measure the uncertainty of structural differences between populations, such as differences in means and in distribution functions, using a confidence interval (CI) obtained via an empirical likelihood approach. We present a statistically sound method for estimating CIs for differences between non-parametric populations with missing values, which are imputed using simple random hot deck imputation. We illustrate the power of CI estimation as a new machine learning technique for tasks such as distinguishing spam from non-spam emails in the Spambase dataset from the UCI repository.
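The two ingredients named in the abstract, random hot deck imputation and a CI for a mean difference, can be illustrated with a minimal sketch. This is not the paper's empirical likelihood procedure: as a simpler stand-in, the interval below is a bootstrap percentile interval, and the function names `hot_deck_impute` and `bootstrap_mean_diff_ci`, the use of `None` to mark missing values, and the sample data are all assumptions made for illustration.

```python
import random
import statistics


def hot_deck_impute(values, rng):
    """Fill each missing entry (None) with a random draw, with
    replacement, from the observed entries of the same sample."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("need at least one observed value")
    return [v if v is not None else rng.choice(observed) for v in values]


def bootstrap_mean_diff_ci(x, y, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap CI for mean(x) - mean(y).

    Assumption: a bootstrap stand-in, NOT the empirical likelihood
    interval of the paper. Each replicate resamples the incomplete
    data and then re-imputes, so the interval also reflects the
    randomness of the imputation step.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        xb = [rng.choice(x) for _ in x]
        yb = [rng.choice(y) for _ in y]
        # Skip the rare replicate with no observed value to impute from.
        if all(v is None for v in xb) or all(v is None for v in yb):
            continue
        xi = hot_deck_impute(xb, rng)
        yi = hot_deck_impute(yb, rng)
        diffs.append(statistics.fmean(xi) - statistics.fmean(yi))
    diffs.sort()
    alpha = 1.0 - level
    lower = diffs[int((alpha / 2) * (len(diffs) - 1))]
    upper = diffs[int((1 - alpha / 2) * (len(diffs) - 1))]
    return lower, upper


# Hypothetical samples; None marks a missing measurement.
new_product = [10.0, 11.0, None, 9.5, 10.5, None, 10.2, 9.8]
old_product = [1.0, None, 0.5, 1.5, 0.8, 1.2, None, 1.1]
print(bootstrap_mean_diff_ci(new_product, old_product))
```

An interval that excludes zero suggests the two populations differ in mean, which is how a CI can act as a decision rule in the way the abstract describes.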
Journal: Pattern Recognition Letters - Volume 29, Issue 6, 15 April 2008, Pages 803–812