Article ID: 484832
Journal: Procedia Computer Science
Published Year: 2015
Pages: 8 Pages
File Type: PDF
Abstract

This paper compares Support Vector Machine (SVM) classification and a number of clustering approaches for separating human from non-human users on Twitter in order to identify normal human activity. Both kinds of approach achieve similar F1 scores of 90%, and both have difficulty classifying human users who behave abnormally. A second-stage classification step was then used to further separate non-human users into brands, celebrities and promoters / information, achieving an average F1 score of 74%. These accuracies were achieved by reducing the size of the feature space using stepwise feature selection and category balancing based on manual inspection of classification results.
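
For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation in plain Python. The function name `f1_score`, the label strings, and the five-user sample are assumptions made for illustration; they are not taken from the paper or its data.

```python
def f1_score(y_true, y_pred, positive="human"):
    """Return the F1 score for the `positive` class.

    F1 is the harmonic mean of precision and recall:
        F1 = 2 * precision * recall / (precision + recall)
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: labelled positive and predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but labelled otherwise.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: labelled positive but predicted otherwise.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five accounts (illustrative only).
y_true = ["human", "human", "human", "not human", "not human"]
y_pred = ["human", "human", "not human", "human", "not human"]

print(f"F1 for the 'human' class: {f1_score(y_true, y_pred):.3f}")
```

An average F1 score across several categories, such as the second-stage figure quoted in the abstract, could be obtained by calling the function once per category with a different `positive` label and averaging the results. Whether the paper uses this kind of unweighted average is not stated in the abstract.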
