Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
11002558 | Computers & Security | 2018 | 19 Pages |
Abstract
“Wikipedia”, known as the world's largest free online encyclopedia, is one of the most remarkable examples of crowdsourcing: millions of articles have been produced by volunteers from all over the world. Wikipedia allows anyone to edit articles without being pre-screened or authorized. A user can edit articles using either a valid ID or an IP address. The freedom to edit using an ID or an IP address makes editorial identities ambiguous. This may affect the integrity of the research process. It also makes it easier for malicious users to vandalize Wikipedia content. Disambiguating users' identities in Wikipedia can help distinguish between trusted and mischievous users and, more importantly, define authorship in a less ambiguous manner. The present paper introduces a new methodology to ascertain Wikipedia authorship and to reduce the ambiguity of user IDs. Our methodology uses the editing activity of users as a distinguishing feature for identifying non-ambiguous profiles. Reducing the ambiguity of authorship can facilitate understanding human behavior in collaborative editing, predicting sock puppetry (duplicate accounts), detecting anomalies, identifying trustworthy as well as offensive authors, and improving security procedures and research in online social media. Our experimental results indicate that editorial activity can be disambiguated with a high degree of certainty, reaching 75% at rank 1.
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Networks and Communications
Authors
Padma Polash Paul, Madeena Sultana, Sorin Adam Matei, Marina Gavrilova