Article code: 484101
Journal code: 703253
Publication year: 2016
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Identifying Users across Different Sites Using Usernames
Keywords
User identification, username similarity, self-information model, abbreviation detection
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
Abstract

Identifying users across different sites means finding the accounts that belong to the same individual. The problem is fundamental, and its results can benefit many applications such as social recommendation. Observing that 1) usernames are essential elements for all sites; 2) most users have a limited number of usernames on the Internet; and 3) usernames carry information that reflects an individual's characteristics and habits, this paper tries to identify users based on username similarity. Specifically, we introduce the self-information vector model to integrate our proposed content and pattern features extracted from usernames into vectors. We define the similarity of two usernames as the cosine similarity between their self-information vectors. We further propose an abbreviation detection method to discover the initialism phenomenon in usernames, which can improve our user identification results. Experimental results on real-world username sets show that we can achieve an 86.19% precision rate, a 68.53% recall rate, and a 76.21% F1-measure on average, which is better than the state-of-the-art work.
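
The idea of comparing usernames by the cosine similarity of their self-information vectors can be illustrated with a minimal sketch. The abstract does not list the paper's content and pattern features, so everything specific here is an assumption for illustration: character bigrams stand in for the features, the probability of a feature is estimated as the fraction of usernames in a small hypothetical corpus that contain it, and unseen features get a simple smoothing floor. The function names are hypothetical as well.

```python
import math
from collections import Counter


def bigrams(username):
    """Character bigrams of a lowercased username (assumed stand-in features)."""
    name = username.lower()
    return {name[i:i + 2] for i in range(len(name) - 1)}


def feature_probabilities(corpus):
    """Estimate p(feature) as the fraction of corpus usernames containing it."""
    counts = Counter()
    for name in corpus:
        counts.update(bigrams(name))
    total = len(corpus)
    return {feat: c / total for feat, c in counts.items()}


def self_information_vector(username, probs, floor):
    """Weight each feature of the username by its self-information, -log2 p."""
    return {feat: -math.log2(probs.get(feat, floor)) for feat in bigrams(username)}


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def username_similarity(a, b, corpus):
    """Similarity of two usernames under the assumptions stated above."""
    probs = feature_probabilities(corpus)
    floor = 1.0 / (len(corpus) + 1)  # assumed smoothing for unseen features
    return cosine_similarity(
        self_information_vector(a, probs, floor),
        self_information_vector(b, probs, floor),
    )


# Hypothetical corpus, used only to estimate feature frequencies.
corpus = ["john_smith", "jsmith84", "alice.w", "bob_the_builder"]
print(username_similarity("john_smith", "jsmith84", corpus))
print(username_similarity("john_smith", "alice.w", corpus))
```

Under this weighting, rare features contribute more to the similarity than common ones, so two usernames that share an unusual substring score higher than two that share only frequent fragments.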

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 80, 2016, Pages 376–385