Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6873134 | Future Generation Computer Systems | 2018 | 12 Pages |
Abstract
Matching user accounts can help us build better users' profiles and benefit many applications. It has attracted much attention from both industry and academia. Most of existing works are mainly based on rich user profile attributes. However, in many cases, user profile attributes are unavailable, incomplete or unreliable, either due to the privacy settings or just because users decline to share their information. This makes the existing schemes quite fragile. Users often share their activities on different social networks. This provides an opportunity to overcome the above problem. We aim to address the problem of user identification based on User Generated Content (UGC). We first formulate the problem of user identification based on UGCs and then propose a UGC-based user identification model. A supervised machine learning based solution is presented. It has three steps: firstly, we propose several algorithms to measure the spatial similarity, temporal similarity and content similarity of two UGCs; secondly, we extract the spatial, temporal and content features to exploit these similarities; afterwards, we employ the machine learning method to match user accounts, and conduct the experiments on three ground truth datasets. The results show that the proposed method has given excellent performance with F1 values reaching 89.79%, 86.78% and 86.24% on three ground truth datasets, respectively. This work presents the possibility of matching user accounts with high accessible online data.
Keywords
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
Yongjun Li, Zhen Zhang, You Peng, Hongzhi Yin, Quanqing Xu,