Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
552231 | 873190 | 2012 | 12-page PDF | Free download |
Page quality estimation is one of the greatest challenges for Web search engines. Hyperlink analysis algorithms such as PageRank and TrustRank are usually adopted for this task. However, low-quality, unreliable, and even spam data in the Web hyperlink graph makes it increasingly difficult to estimate page quality effectively. By analyzing large-scale user browsing behavior logs, we found that a more reliable Web graph can be constructed by incorporating browsing behavior information. The experimental results show that hyperlink graphs constructed with the proposed methods are much smaller in size than the original graph. In addition, algorithms based on the proposed “surfing with prior knowledge” model obtain better estimation results with these graphs for both high-quality page and spam page identification tasks. Hyperlink graphs constructed with the proposed methods evaluate Web page quality more precisely and with less computational effort.
► Constructing a novel kind of Web graph with user browsing information.
► PageRank on the user browsing graph outperforms PageRank on the original Web graph.
► User browsing graph shares similar characteristics with Web graph.
► Surfing with prior knowledge outperforms random walk.
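To make the contrast between a random walk and a walk informed by browsing behavior concrete, here is a minimal sketch. It is not the paper's "surfing with prior knowledge" model, whose details are not given on this page. It assumes, purely for illustration, that the prior knowledge takes the form of per-link click counts, so that the surfer follows a link with probability proportional to how often users followed it. The function name `weighted_pagerank` and the toy graphs `links` and `clicks` are hypothetical.

```python
from collections import defaultdict


def weighted_pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank on a weighted directed graph.

    edges maps (src, dst) -> non-negative weight. With all weights equal
    this is the usual random-surfer model; with click counts as weights
    (an assumption of this sketch) the surfer prefers frequently used links.
    """
    nodes = sorted({node for pair in edges for node in pair})
    n = len(nodes)

    # Total outgoing weight per node, used to normalise transition odds.
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no usable out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        for (src, dst), weight in edges.items():
            if weight > 0:
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]

        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy data: the same four links, first unweighted,
# then weighted by made-up click counts.
links = {("a", "b"): 1, ("a", "c"): 1, ("b", "a"): 1, ("c", "a"): 1}
clicks = {("a", "b"): 9, ("a", "c"): 1, ("b", "a"): 5, ("c", "a"): 2}

uniform_scores = weighted_pagerank(links)
click_scores = weighted_pagerank(clicks)
```

On the unweighted graph, pages `b` and `c` receive the same score because the hyperlink structure cannot tell them apart. With the click counts, `b` scores higher than `c` because users follow the link from `a` to `b` far more often.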
Journal: Decision Support Systems - Volume 54, Issue 1, December 2012, Pages 390–401