Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
459173 | Journal of Network and Computer Applications | 2016 | 12 Pages |
• Bit-level n-gram based forensic authorship analysis on social media.
• Identifying individuals from linguistic profiles.
• Extraction of authors’ linguistic features from the text of their postings.
Users interact with social media in a number of ways, providing a variety of data, from ratings and approvals to quantities of text. Public discussion for hotspots in particular generates significant volume and velocity of user-contributed text, frequently attributable to a user identifier or nom de plume. It may be feasible to determine authorship of various tracts of text on social media using n-gram analysis on the bit-level rendition of the text. This paper explores the facility of bit-level n-gram analysis with other statistical classification approaches for determining authorship on two months of captured user postings from an online news and opinion website with moderated discussion. The results show that this approach can achieve a good recognition rate with a low false negative rate.
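As a rough illustration of what "n-gram analysis on the bit-level rendition of the text" could look like, the Python sketch below turns a posting into a bit string, counts overlapping n-bit windows, and compares two such profiles. The abstract does not state the n-gram length, the text encoding, or the classifiers used, so the UTF-8 encoding, the default `n = 8`, the cosine comparison, and the function names `to_bits`, `bit_ngram_profile`, and `cosine_similarity` are all assumptions made for this example, not the paper's method.

```python
from collections import Counter
from math import sqrt


def to_bits(text: str, encoding: str = "utf-8") -> str:
    """Render text as a string of '0'/'1' characters, 8 bits per byte.

    The encoding is an assumption; the abstract does not name one.
    """
    return "".join(f"{byte:08b}" for byte in text.encode(encoding))


def bit_ngram_profile(text: str, n: int = 8) -> Counter:
    """Count every overlapping window of n bits in the bit rendition.

    The window slides one bit at a time, so windows may straddle
    byte (and therefore character) boundaries.
    """
    bits = to_bits(text)
    return Counter(bits[i:i + n] for i in range(len(bits) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count profiles.

    Stands in for the unspecified statistical classifiers; returns 0.0
    when either profile is empty.
    """
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical postings, invented purely for illustration.
    known = bit_ngram_profile("I reckon the council got this one wrong, again.")
    query = bit_ngram_profile("I reckon they got it wrong again, as usual.")
    print(f"similarity: {cosine_similarity(known, query):.3f}")
```

In such a setup, one profile would be built per user identifier from that user's known postings, and an unattributed posting would be assigned to the author whose profile it most resembles. A realistic evaluation would also need a held-out test set and a decision threshold in order to report recognition and false negative rates.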