Article code | Journal code | Year of publication | English article | Full-text version
485393 | 703325 | 2016 | 8 pages, PDF | Free download
English title of the ISI article
Name Disambiguation Method Based on Multi-step Clustering
Persian translation of the title
Name disambiguation method based on multi-step clustering
Keywords
Name ambiguity, feature extraction, semantic recognition, hierarchical clustering
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

Author name disambiguation is an important and complex research topic. When searching and reviewing the literature, different authors frequently share the same name, which lowers the quality of retrieval results and lengthens the research cycle. A reasonable and efficient method is therefore needed to distinguish authors who share a name. This paper proposes an author name disambiguation method based on multi-step clustering (NDMC). First, the framework combines the concise, well-defined metadata held by the literature system with a comparison of co-author similarity to produce an initial clustering. Second, author information is extracted from the Baidu Encyclopedia, and the semantic similarity of the authors' affiliations is compared as the basis for identity discrimination in the second clustering step. Finally, the keywords of the papers in each cluster produced by the first two steps are extracted and combined into a corpus, and a semantic comparison of these features is used to resolve the remaining indeterminate results and further adjust the clusters, completing the multi-step clustering. The experiments use literature information extracted from the China National Knowledge Infrastructure (CNKI). The results show that the hybrid disambiguation framework is feasible and efficient.
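
The three-step flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' NDMC implementation: plain Jaccard token overlap stands in for the semantic similarity measures, the affiliation is given directly instead of being looked up in the Baidu Encyclopedia, and the field names (`coauthors`, `affiliation`, `keywords`), the thresholds, and the sample records are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0


def merge_step(clusters, features, threshold):
    """Greedily merge clusters whose pooled feature sets overlap enough.

    clusters: list of lists of paper indices
    features: function mapping a paper index to a set of tokens
    """
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            fi = set().union(*(features(p) for p in clusters[i]))
            fj = set().union(*(features(p) for p in clusters[j]))
            if jaccard(fi, fj) >= threshold:
                clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
                merged = True
                break  # cluster list changed; rescan from the start
    return clusters


def disambiguate(papers, t_coauthor=0.2, t_affil=0.5, t_keyword=0.3):
    """Three passes over papers that all carry the same author name."""
    clusters = [[i] for i in range(len(papers))]
    # Step 1: initial clustering on co-author overlap.
    clusters = merge_step(clusters, lambda p: papers[p]["coauthors"], t_coauthor)
    # Step 2: merge on affiliation similarity (token overlap as a stand-in).
    clusters = merge_step(
        clusters,
        lambda p: set(papers[p]["affiliation"].lower().split()),
        t_affil,
    )
    # Step 3: adjust the remaining clusters using pooled paper keywords.
    clusters = merge_step(clusters, lambda p: papers[p]["keywords"], t_keyword)
    return [sorted(c) for c in clusters]


# Hypothetical records for one ambiguous author name.
papers = [
    {"coauthors": {"A. Lin", "B. Zhao"}, "affiliation": "Northfield Institute of Technology",
     "keywords": {"clustering", "name disambiguation"}},
    {"coauthors": {"A. Lin", "C. Chen"}, "affiliation": "Northfield Institute of Technology",
     "keywords": {"feature extraction"}},
    {"coauthors": {"D. Sun"}, "affiliation": "Northfield Institute of Technology",
     "keywords": {"semantic similarity"}},
    {"coauthors": {"E. Liu"}, "affiliation": "Riverside Medical College",
     "keywords": {"cardiology", "hypertension"}},
    {"coauthors": {"F. Xu"}, "affiliation": "",
     "keywords": {"hypertension", "cardiology"}},
]

clusters = disambiguate(papers)
print(clusters)
```

In this toy data, papers 0 and 1 are joined by a shared co-author, paper 2 joins them through the matching affiliation, and papers 3 and 4 are only linked in the last pass by their keywords. A paper with no known affiliation never merges in step 2, because `jaccard` returns 0.0 when either set is empty.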

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 83, 2016, Pages 488–495