Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
384947 | 660857 | 2012 | 5-page PDF | Free download |

Spam has become an increasingly serious problem with a large economic impact on society. Spam filtering is a special case of text categorization whose defining characteristic is that the filter faces an active adversary who constantly attempts to evade it. In this paper, we present a novel approach to spam filtering based on the minimum description length (MDL) principle and confidence factors. The proposed model is fast to construct and incrementally updateable. We evaluate it empirically on three well-known, large, public e-mail databases. The results indicate that the proposed classifier outperforms state-of-the-art spam filters.
► We present a novel approach to spam filtering based on the MDL principle and confidence factors.
► The proposed model is fast to construct and incrementally updateable.
► We report experimental results on the largest and most realistic laboratory evaluations to date.
► The results indicate that the classifier outperforms state-of-the-art spam filters.
Journal: Expert Systems with Applications - Volume 39, Issue 7, 1 June 2012, Pages 6557–6561
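
The abstract names the minimum description length principle but does not spell out the model, so the following is only a minimal sketch of the general idea, not the paper's classifier. It assumes a Laplace-smoothed unigram token model per class and labels a message with the class whose model encodes it in the fewest bits. The class name `MDLSketch`, its methods, and the `smoothing` parameter are hypothetical, and the confidence factors mentioned in the abstract are not modeled here because the abstract does not define them.

```python
import math
from collections import Counter


class MDLSketch:
    """Toy MDL-style filter (illustrative assumption, not the paper's model)."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing  # assumed Laplace smoothing constant
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}
        self.vocab = set()

    def update(self, tokens, label):
        # Incremental training: only the counters change, no refit is needed.
        self.counts[label].update(tokens)
        self.totals[label] += len(tokens)
        self.vocab.update(tokens)

    def description_length(self, tokens, label):
        # Bits needed to encode the tokens under the smoothed model of `label`.
        slots = len(self.vocab) + 1  # one extra slot for unseen tokens
        denom = self.totals[label] + self.smoothing * slots
        bits = 0.0
        for token in tokens:
            p = (self.counts[label][token] + self.smoothing) / denom
            bits -= math.log2(p)
        return bits

    def classify(self, tokens):
        # Choose the class that gives the shortest description of the message.
        return min(("spam", "ham"),
                   key=lambda c: self.description_length(tokens, c))
```

Because training only increments counters, a newly labeled message can be folded in with a single `update` call, which is one simple way to obtain the "fast to construct and incrementally updateable" property the abstract describes.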