Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
558290 | 874892 | 2014 | 13-page PDF | Free download |
• High-level features for cyberpedophilia detection are proposed.
• The fixated discourse model is suggested.
• Experiments on distinguishing between pedophiles’ and non-pedophiles’ chats are performed.
• Feature analysis is presented.
In this paper, we propose a list of high-level features and study their applicability to the detection of cyberpedophiles. We used a corpus of chats downloaded from http://www.perverted-justice.com and two negative datasets of different natures: cybersex logs available online and the NPS chat corpus. The classification results show that the NPS data and the pedophiles’ conversations can be accurately discriminated from each other with character n-grams, whereas the more difficult case of cybersex logs requires high-level features to reach good accuracy. In this latter setting, our results show that features that model behaviour and emotion significantly outperform the low-level ones and achieve 97% accuracy.
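The abstract names character n-grams as the low-level baseline but does not say how they were extracted. As a minimal sketch only, assuming overlapping n-grams over lowercased text with n = 3, the feature counts could be computed as follows. The function name `char_ngrams`, the default `n`, and the lowercasing step are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in a lowercased string.

    Hypothetical helper for illustration; the paper's actual
    preprocessing and choice of n are not given in the abstract.
    """
    text = text.lower()
    # Slide a window of width n across the string, one character at a time.
    # A string shorter than n yields an empty Counter.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    # Neutral toy input, used only to show the shape of the output.
    for gram, count in sorted(char_ngrams("hello hello").items()):
        print(repr(gram), count)
```

Counts like these would typically be turned into a fixed-length vector per chat and passed to a standard classifier; which classifier and weighting the authors used is not stated here.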
Journal: Computer Speech & Language - Volume 28, Issue 1, January 2014, Pages 108–120