Article code: 515649
Journal code: 867059
Publication year: 2012
English article: 21-page PDF
Full-text version: free download
English title of the ISI article
Updating broken web links: An automatic recommendation system
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract

Broken hypertext links are a frequent problem on the Web. Sometimes the page which a link points to has disappeared forever, but in many other cases the page has simply been moved to another location in the same web site or to another one. In some cases the page, besides being moved, is also updated, so that it differs slightly from the original but remains rather similar. In all these cases it can be very useful to have a tool that provides us with pages highly related to the broken link, since we could select the most appropriate one. The relationship between the broken link and its possible linkable pages can be defined as a function of many factors. In this work we have employed several resources both in the context of the link and in the Web to look for pages related to a broken link. From the resources in the context of a link, we have analyzed several sources of information such as the anchor text, the text surrounding the anchor, the URL and the page containing the link. We have also extracted information about a link from the Web infrastructure such as search engines, Internet archives and social tagging systems. We have combined all of these resources to design a system that recommends pages that can be used to recover the broken link. A novel methodology is presented to evaluate the system without resorting to user judgments, thus increasing the objectivity of the results, and helping to adjust the parameters of the algorithm. We have also compiled a web page collection with true broken links, which has been used to test the full system by humans. Results show that the system is able to recommend the correct page among the first ten results when the page has been moved, and to recommend highly related pages when the original one has disappeared.
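
The idea of scoring candidate pages against the context of a broken link can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes plain term-frequency vectors and cosine similarity, and the function names, the `anchor_weight` parameter and the example URLs are all hypothetical. It only shows how terms from the anchor text and its surrounding text might be combined into one query and used to order candidate replacements.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-weight Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(anchor_text, context_text, candidates, anchor_weight=2.0):
    """Order candidate pages (url -> page text) by similarity to the link context.

    anchor_weight is a hypothetical parameter (an assumption of this sketch):
    anchor terms count more than terms from the surrounding text.
    """
    query = Counter()
    for term in tokenize(anchor_text):
        query[term] += anchor_weight
    for term in tokenize(context_text):
        query[term] += 1.0
    scored = [
        (cosine(query, Counter(tokenize(text))), url)
        for url, text in candidates.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]


if __name__ == "__main__":
    # Hypothetical candidate pages, e.g. as returned by a search engine.
    pages = {
        "http://example.org/new/ir-course": "information retrieval course syllabus and lectures",
        "http://example.org/cooking": "pasta recipes and cooking tips",
    }
    print(rank_candidates("information retrieval course",
                          "see the syllabus for details", pages))
```

In a fuller system the candidate texts would come from the Web resources the abstract lists (search engines, Internet archives, social tagging systems), and the weighting of each source would need to be tuned rather than fixed by hand.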


► Several resources were used in the context of a link and in the Web infrastructure.
► The system recommends pages that can be used to recover any broken link.
► A methodology is proposed to evaluate the system without resorting to user judgments.
► The correct page is provided in top ten hits when the missing page has been moved.
► Highly related pages are recommended when the original one has disappeared.

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 48, Issue 2, March 2012, Pages 183–203