Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
379252 | 659281 | 2007 | 17-page PDF | Free download |
English title of the ISI article
A cost-effective method for detecting web site replicas on search engine databases
Related topics
Engineering and Basic Sciences
Computer Engineering
Artificial Intelligence
Preview of the article's first page
![First page of the article: A cost-effective method for detecting web site replicas on search engine databases](/preview/png/379252.png)
Abstract
Identifying replicated sites is an important task for search engines. It can reduce data storage costs, improve query processing time and remove noise that might affect the quality of the final answers given to the user. This paper introduces a new approach to detect web sites that are likely to be replicas in a search engine database. Our method uses the websites’ structure and the content of their pages to identify possible replicas. As we show through experiments, such a combination improves the precision and reduces the overall costs related to the replica detection task. Our method achieves a quality improvement of 47.23% when compared to previously proposed approaches.
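The abstract names two signals, site structure and page content, without giving the algorithm. The sketch below is only an illustration of how such a combination could work, not the authors' method. It compares two sites by the Jaccard overlap of their URL paths (structure) and by the share of common paths whose normalized text hashes match (content). The site representation (a mapping from path to page text), all function names, and the 0.8 thresholds are assumptions made for this example.

```python
import hashlib


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def content_fingerprint(text: str) -> str:
    """SHA-1 of the lowercased, whitespace-normalized page text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def replica_score(site_a: dict, site_b: dict) -> tuple:
    """Return (structure_similarity, content_agreement) for two sites.

    Each site is a hypothetical mapping from URL path to page text.
    """
    paths_a, paths_b = set(site_a), set(site_b)
    structure = jaccard(paths_a, paths_b)
    shared = paths_a & paths_b
    if not shared:
        # No common paths, so there is no page content to compare.
        return structure, 0.0
    same = sum(
        content_fingerprint(site_a[p]) == content_fingerprint(site_b[p])
        for p in shared
    )
    return structure, same / len(shared)


def likely_replicas(site_a: dict, site_b: dict,
                    min_structure: float = 0.8,
                    min_content: float = 0.8) -> bool:
    """Flag a pair only when both signals clear their (assumed) thresholds."""
    structure, content = replica_score(site_a, site_b)
    return structure >= min_structure and content >= min_content


if __name__ == "__main__":
    original = {"/": "Home page", "/about": "About us", "/docs": "Docs"}
    mirror = {"/": "home   page", "/about": "About us", "/docs": "Docs"}
    print(replica_score(original, mirror))
    print(likely_replicas(original, mirror))
```

Requiring both signals is one way to avoid flagging sites that merely share a common directory layout or a handful of identical pages. Comparing paths only needs the URL list, so in a cost-conscious variant the structural check could run first and the content check only for pairs that pass it.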
Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volume 62, Issue 3, September 2007, Pages 421–437
Authors
André Luiz da Costa Carvalho, Edleno Silva de Moura, Altigran Soares da Silva, Klessius Berlt, Allan Bezerra