Text Normalization in Social Media: Progress, Problems and Applications for a Pre-Processing System of Casual English

Article code	Journal code	Publication year	English article	Full-text version
1123291	1488532	2011	10-page PDF	Free download


Related topics

Social Sciences and Humanities; Arts and Humanities; Arts and Humanities (General)


Abstract

The rapid expansion in user-generated content on the Web of the 2000s, characterized by social media, has led to Web content featuring somewhat less standardized language than the Web of the 1990s. User creativity and individuality of language create problems on two levels. The first is that social media text is often unsuitable as data for Natural Language Processing tasks such as Machine Translation, Information Retrieval and Opinion Mining, due to the irregularity of the language featured. The second is that non-native speakers of English, older Internet users and non-members of the “in-group” often find such texts difficult to understand. This paper discusses problems involved in automatically normalizing social media English, various applications for its use, and our progress thus far in a rule-based approach to the issue. In particular, we evaluate the performance of two leading open source spell checkers on data taken from the microblogging service Twitter, and measure the extent to which their accuracy is improved by pre-processing with our system. We also present our database rules and classification system, results of evaluation experiments, and plans for expansion of the project.

Publisher

Database: Elsevier - ScienceDirect
Journal: Procedia - Social and Behavioral Sciences - Volume 27, 2011, Pages 2-11
