Article code: 6862126, Journal code: 1439264, Publication year: 2017
English article: 18-page PDF, Full-text version: free download
English title of the ISI article
Minmax Circular Sector Arc for External Plagiarism's Heuristic Retrieval stage
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
The Heuristic Retrieval (HR) task aims to retrieve a set of documents from which External Plagiarism detection identifies plagiarized pieces of text. In this context, we present Minmax Circular Sector Arcs (MinmaxCSA) algorithms that treat the HR task as an approximate k-nearest neighbor search problem. MinmaxCSA algorithms aim to retrieve the set of documents with the greatest amounts of plagiarized fragments while reducing the time needed to accomplish the HR task. Our theoretical framework is based on two aspects: (i) a triangular property to encode a range of sketches into a unique value; and (ii) a Circular Sector Arc property that makes (i) more accurate. Both properties were proposed for handling high-dimensional spaces by hashing them to a lower number of hash values. Our two MinmaxCSA methods, Minmax Circular Sector Arcs Lower Bound (CSAL) and Minmax Circular Sector Arcs Full Bound (CSA), achieved slightly less precise Recall levels than Minmaxwise hashing in exchange for a better Speedup in document indexing, query extraction, and retrieval time on high-dimensional plagiarism-related datasets.
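For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of generic min-max hashing for estimating set similarity between two documents. It is not the MinmaxCSA method, whose triangular and Circular Sector Arc encodings are not specified in the abstract. The function names (`shingles`, `minmax_sketch`, `estimate_jaccard`), the word 3-gram shingling, and the BLAKE2b-based hash family are all assumptions chosen for illustration.

```python
import hashlib


def shingles(text, n=3):
    """Return the set of word n-grams of a text (assumed document representation)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def _hash(item, seed):
    """Deterministic 64-bit hash of a string; the seed selects the hash function."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minmax_sketch(items, num_functions=64):
    """Build a sketch of 2 * num_functions values.

    Each hash function contributes both its minimum and its maximum over
    the set, so half as many hash functions are evaluated as in plain
    minwise hashing for the same sketch length.
    """
    if not items:
        raise ValueError("cannot sketch an empty set")
    sketch = []
    for seed in range(num_functions):
        values = [_hash(item, seed) for item in items]
        sketch.append(min(values))
        sketch.append(max(values))
    return sketch


def estimate_jaccard(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of matching sketch positions."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must have the same length")
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)


if __name__ == "__main__":
    source = shingles("the quick brown fox jumps over the lazy dog near the river bank")
    suspect = shingles("the quick brown fox jumps over the lazy dog in the city park")
    print(estimate_jaccard(minmax_sketch(source), minmax_sketch(suspect)))
```

In a retrieval setting, sketches like these would be computed once per indexed document and compared against the sketch of a query document, so that only the most similar candidates are passed on to the detailed plagiarism comparison.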
Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 137, 1 December 2017, Pages 1-18