Combating Web spam through trust–distrust propagation with confidence

Article ID	Journal	Published Year	Pages	File Type
533953	Pattern Recognition Letters	2013	8 Pages	PDF

Abstract

• We propose the GBR algorithm, which propagates both trust and distrust.
• How strongly a page propagates trust or distrust is determined by its probability of being trustworthy or untrustworthy.
• GBR takes advantage of both trust and distrust propagation.
• GBR outperforms other typical algorithms in both spam demotion and spam detection.

Semi-automatic anti-spam algorithms propagate either trust through links from a set of good seed pages, or distrust through inverse links from a set of bad seed pages, to the entire Web. It has been suggested that combining trust and distrust propagation can lead to better results. However, little known work has realized this insight successfully. In this paper, we take the view that each Web page has both a trustworthy side and an untrustworthy side, and propose assigning two scores to each Web page to denote its trustworthy side and its untrustworthy side, respectively. We then propose the Good-Bad Rank (GBR) algorithm, which propagates trust and distrust simultaneously from both directions. In GBR, the propagation of a page's trust or distrust is determined by its probability of being trustworthy or untrustworthy. Because GBR takes advantage of both trust and distrust propagation, it is more powerful than propagating only trust or only distrust. Experimental results show that GBR outperforms other typical link-based anti-spam algorithms that propagate only trust or only distrust. GBR achieves performance comparable to that of TDR, another algorithm that propagates both trust and distrust, but is much more efficient than TDR.
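
The idea of keeping two scores per page and letting each page's estimated probability of being good control how much trust or distrust it passes on can be sketched roughly as follows. This is only an illustrative sketch, not the update rule from the paper: the abstract gives no formulas, so the function name good_bad_rank, the damping factor alpha, the uniform seed distributions, and the estimate good / (good + bad) for a page's probability of being trustworthy are all assumptions made for this example.

```python
def good_bad_rank(links, good_seeds, bad_seeds, alpha=0.85, iters=50):
    """Illustrative two-score propagation (assumed update rule, not the paper's).

    links: dict mapping a page to the list of pages it links to.
    Returns (good, bad): two dicts of per-page scores.
    """
    pages = set(links) | {q for outs in links.values() for q in outs}
    pages |= set(good_seeds) | set(bad_seeds)
    out_links = {p: list(links.get(p, [])) for p in pages}
    in_links = {p: [] for p in pages}
    for p, outs in out_links.items():
        for q in outs:
            in_links[q].append(p)

    # Assumption: seed mass is spread uniformly over each seed set.
    g_seed = {p: (1.0 / len(good_seeds) if p in good_seeds else 0.0) for p in pages}
    b_seed = {p: (1.0 / len(bad_seeds) if p in bad_seeds else 0.0) for p in pages}
    good, bad = dict(g_seed), dict(b_seed)

    for _ in range(iters):
        # Assumption: probability of being good is the share of the good score;
        # pages with no evidence either way get 0.5.
        prob = {}
        for p in pages:
            total = good[p] + bad[p]
            prob[p] = good[p] / total if total > 0 else 0.5

        new_good, new_bad = {}, {}
        for p in pages:
            # Trust flows forward along links, weighted by the source's
            # probability of being good.
            trust_in = sum(prob[q] * good[q] / len(out_links[q]) for q in in_links[p])
            # Distrust flows backward along links, weighted by the target's
            # probability of being bad.
            distrust_in = sum(
                (1.0 - prob[q]) * bad[q] / len(in_links[q]) for q in out_links[p]
            )
            new_good[p] = alpha * trust_in + (1.0 - alpha) * g_seed[p]
            new_bad[p] = alpha * distrust_in + (1.0 - alpha) * b_seed[p]
        good, bad = new_good, new_bad

    return good, bad


if __name__ == "__main__":
    # Hypothetical toy graph: g1 is a known good page, s2 a known bad page.
    toy_links = {"g1": ["g2"], "s1": ["s2"]}
    good, bad = good_bad_rank(toy_links, good_seeds={"g1"}, bad_seeds={"s2"})
    for page in sorted(good):
        print(page, round(good[page], 4), round(bad[page], 4))
```

In this toy graph, g2 ends up with a positive good score because a good seed links to it, while s1 ends up with a positive bad score because it links to a bad seed. A page could then be ranked or flagged by comparing its two scores, which is one plausible way to use such scores for spam demotion and spam detection.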

Keywords

Trust propagation; Web spam