Article code: 4966502
Journal code: 1365125
Publication year: 2017
English article: 17-page PDF
Full-text version: Free download
English title of the ISI article
Rapid detection of similar peer-reviewed scientific papers via constant number of randomized fingerprints
Persian translation of the title
تشخیص سریع مقالات علمی مشابه توسط پژوهشگران با استفاده از تعداد ثابت اثر انگشت تصادفی
Keywords
Fingerprints; heuristic methods; plagiarism detection; similar peer-reviewed scientific papers
Related topics
Engineering and basic sciences; computer engineering; computer science software
English abstract
This research is concerned with the detection of similar academic papers. Given a tested paper from a corpus of 10,099 peer-reviewed scientific papers, a two-stage process was applied. In the first stage, most of the papers were filtered out using a fast filter method. In the second stage, to detect similar papers, we applied 23 heuristic variants derived from 3 novel prototype methods using various parameter settings. The three novel prototype methods are: CT-TR: Constant Number of randomized T fingerprints, compared to each one-third of R (first/middle/last) fingerprints; CT-AR: Constant Number of randomized T fingerprints, compared to all R fingerprints; and CDT-AR: Constant Number of divided randomized T fingerprints, compared to all R fingerprints. Results achieved by the new methods are superior to those of previous heuristic methods, which were approximations of the “Full Fingerprint” (FF) method, currently considered the best heuristic method. The order of the new methods' run-time, Θ(n), is far more efficient than that of the FF method, Θ(n²) (after removing short documents from the corpus).
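To make the CT-AR idea from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that T is the tested paper and R is a candidate paper that survived the first-stage filter, that a fingerprint is a hashed word 3-gram, and that similarity is the share of sampled T fingerprints found among all R fingerprints. The names `fingerprints` and `ct_ar_similarity` and the parameters `k`, `n`, and `seed` are hypothetical and chosen only for illustration.

```python
import hashlib
import random


def fingerprints(text, n=3):
    """Return hashed word n-grams of `text` (an assumed fingerprint definition)."""
    words = text.lower().split()
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return [hashlib.md5(g.encode("utf-8")).hexdigest() for g in grams]


def ct_ar_similarity(tested, retrieved, k=20, n=3, seed=0):
    """Compare a constant number k of random T fingerprints to all R fingerprints.

    Returns the fraction of the sampled T fingerprints that also occur in R.
    The scoring rule is an assumption; the abstract does not specify it.
    """
    t_fps = fingerprints(tested, n)
    r_fps = set(fingerprints(retrieved, n))  # all R fingerprints, for O(1) lookups
    if not t_fps:
        return 0.0
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(t_fps, min(k, len(t_fps)))
    hits = sum(1 for fp in sample if fp in r_fps)
    return hits / len(sample)
```

Because `k` is fixed, the number of fingerprint lookups per pair of papers does not grow with the length of the tested paper; only building the fingerprint set for R depends on document size.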
Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 53, Issue 1, January 2017, Pages 70-86