Article code: 515459
Journal code: 867018
Publication year: 2011
English article: 14-page PDF
Full-text version: free download
English title of the ISI article
Construction of a large-scale test set for author disambiguation
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract

Author disambiguation resolves same-name author occurrences in bibliographic data into distinct namesakes. This enables author-centered searches and high-quality social network analysis. To promote research in author disambiguation, KISTI has constructed a new large-scale test set for this field. This article describes its semi-manual creation procedure and its characteristics, especially in terms of author ambiguity and name diversity. In addition, it provides the baseline performance of author clustering on the test set.

Research highlights
► To overcome the weaknesses of previous test sets and to foster research on author disambiguation, a new large-scale test set was constructed.
► Of the six steps in the test set construction procedure, Step 4 helped reduce construction time by automatically acquiring Web evidence to resolve name occurrences to persons.
► The new test set shows more diversity in author ambiguity, sizes of same-name groups, and non-English names than previous test sets.
► Experiments on the new test set indicate that the complexity of the author resolution problem depends more on author ambiguity than on the number of same-name author instances.
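
The two quantities contrasted in the last highlight can be illustrated with a minimal sketch. It assumes that "size" means the number of occurrences of one name string and that "author ambiguity" means the number of distinct persons behind that name; the article may define both differently. The records, names, and the function `same_name_group_stats` are hypothetical and are not taken from the KISTI test set.

```python
from collections import defaultdict

# Hypothetical labeled records: (author name string, person identifier).
# These are toy values for illustration, not data from the test set.
records = [
    ("J. Kim", "p1"),
    ("J. Kim", "p2"),
    ("J. Kim", "p1"),
    ("J. Kim", "p3"),
    ("S. Lee", "p4"),
    ("S. Lee", "p4"),
    ("S. Lee", "p4"),
    ("S. Lee", "p4"),
]


def same_name_group_stats(records):
    """Return {name: (group_size, ambiguity)} for labeled name occurrences.

    group_size: number of occurrences of the name string.
    ambiguity:  number of distinct persons sharing that name string
                (an assumed definition, see the lead-in).
    """
    persons_by_name = defaultdict(list)
    for name, person_id in records:
        persons_by_name[name].append(person_id)
    return {
        name: (len(ids), len(set(ids)))
        for name, ids in persons_by_name.items()
    }


stats = same_name_group_stats(records)
for name, (size, ambiguity) in stats.items():
    print(f"{name}: size={size}, ambiguity={ambiguity}")
```

In this toy input both same-name groups have the same size, but only one of them contains more than one person, which is the kind of difference the highlight attributes the difficulty to.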

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 47, Issue 3, May 2011, Pages 452–465