Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6948314 | 1451031 | 2018 | 40-page PDF | Free download |
English title of the ISI article
An abusive text detection system based on enhanced abusive and non-abusive word lists
Persian translation of the title
یک سیستم تشخیص متن سوء استفاده بر اساس لیست های اصلاح شده سوء استفاده و غیر سوء استفاده کننده
Keywords
Abusive words, slang words, cyberbullying, detection systems
Related topics
Engineering and Basic Sciences
Computer Engineering
Information Systems
English abstract
Abusive text (indiscriminate slang, abusive language, and profanity) on the Internet is not just a message but rather a tool for very serious and brutal cyber violence. It has become an important problem to devise a method for detecting and preventing abusive text online. However, the intentional obfuscation of words and phrases makes this task very difficult and challenging. We design a decision system that successfully detects (obfuscated) abusive text using unsupervised learning of abusive words based on word2vec's skip-gram and the cosine similarity. The system also deploys several efficient gadgets for filtering abusive text such as blacklists, n-grams, edit-distance metrics, mixed languages, abbreviations, punctuation, and words with special characters to detect the intentional obfuscation of abusive words. We integrate both an unsupervised learning method and efficient gadgets into a single system that enhances abusive and non-abusive word lists. The integrated decision system based on the enhanced word lists shows a precision of 94.08%, a recall of 80.79%, and an f-score of 86.93% in malicious word detection for news article comments; a precision of 89.97%, a recall of 80.55%, and an f-score of 85.00% for online community comments; and a precision of 90.65%, a recall of 93.57%, and an f-score of 92.09% for Twitter tweets. We expect that our approach can help to improve the current abusive word detection system, which is crucial for several web-based services including social networking services and online games.
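The abstract names three standard measures: cosine similarity between word vectors, edit-distance metrics for obfuscated spellings, and the f-score. The Python sketch below gives textbook definitions of each, only to make the terms concrete. It is not the authors' implementation: the function names are hypothetical, Levenshtein distance is assumed as the edit-distance metric, and the f-score is assumed to be the balanced F1 (harmonic mean of precision and recall).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors (an assumption)
    return dot / (norm_u * norm_v)


def edit_distance(s, t):
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs quoted in the abstract, in percent.
reported = {
    "news article comments": (94.08, 80.79),
    "online community comments": (89.97, 80.55),
    "Twitter tweets": (90.65, 93.57),
}
for name, (p, r) in reported.items():
    print(f"{name}: F = {f_score(p, r):.2f}%")
```

Under the F1 assumption, the three computed values round to the f-scores quoted in the abstract (86.93%, 85.00%, and 92.09%), so the reported figures are internally consistent.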
Publisher
Database: Elsevier - ScienceDirect
Journal: Decision Support Systems - Volume 113, September 2018, Pages 22-31
Authors
Ho-Suk Lee, Hong-Rae Lee, Jun-U Park, Yo-Sub Han