Article code: 4961490
Journal code: 1446512
Publication year: 2017
English article: 5 pages, PDF
Full-text version: free download
English title of the ISI article
Detecting Near-duplicates in Russian Documents through Using Fingerprint Algorithm Simhash
Keywords
Plagiarism, Simhash fingerprint algorithm
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

Plagiarism is one of the major problems in the age of communication. In many languages, such as English, the issue is taken seriously, and many powerful tools have been developed to prevent it. This article aims to detect plagiarism in Russian texts using a fingerprint algorithm. Fingerprint algorithms detect plagiarism quickly because they reduce each document to a compact set of features and compare only those features between the original documents and the suspicious ones. To increase the power and accuracy of plagiarism detection, general words (stop words) must be removed and words reduced to their roots (stemming) before pre-processing steps such as word separation, number replacement, and homogenization are applied. Four Simhash algorithms are used in this article. Implemented and tested on 800 articles on scientific topics, these algorithms gave satisfactory results.
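
To illustrate the idea of a compact fingerprint, here is a minimal Python sketch of the generic, unweighted Simhash scheme. It is not one of the four variants used in the article, which the abstract does not describe. The function names, the 64-bit fingerprint size, the use of MD5 as the token hash, and the simple regex tokenizer are all assumptions made for this example.

```python
import hashlib
import re


def tokenize(text, stop_words=frozenset()):
    """Lowercase the text, split it into word tokens, and drop stop words.

    The regex tokenizer and the stop-word set are illustrative assumptions;
    stemming is omitted here.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stop_words]


def simhash(tokens, bits=64):
    """Return a `bits`-wide Simhash fingerprint of a token list.

    `bits` must be a multiple of 8 and at most 128, because the token
    hash is taken from the first bytes of an MD5 digest (an assumption).
    """
    counts = [0] * bits  # one signed counter per bit position
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # A set bit votes +1 for this position, a clear bit votes -1.
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:  # a positive total sets the fingerprint bit
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two documents would then be flagged as near-duplicates when the Hamming distance between their fingerprints falls below a chosen threshold. The threshold is a tuning parameter that this sketch leaves open.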

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 103, 2017, Pages 421-425