Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4943680 | 1437640 | 2016 | 18-page PDF | Free download |
English title of the ISI article
An approach to the use of word embeddings in an opinion classification task
Persian translation of the title
یک رویکرد به استفاده از تعبیر کلمه در یک کار گروه بندی نظر
Related topics
Engineering and basic sciences
Computer engineering
Artificial intelligence
English abstract
In this paper we show how a vector-based word representation obtained via word2vec can help to improve the results of a document classifier based on bags of words. Both models allow obtaining numeric representations from texts, but they do it very differently. The bag of words model can represent documents by means of widely dispersed vectors in which the indices are words or groups of words. word2vec generates word level representations building vectors that are much more compact, where indices implicitly contain information about the context of word occurrences. Bags of words are very effective for document classification and in our experiments no representation using only word2vec vectors is able to improve their results. However, this does not mean that the information provided by word2vec is not useful for the classification task. When this information is used in combination with the bags of words, the results are improved, showing its complementarity and its contribution to the task. We have also performed cross-domain experiments in which word2vec has shown much more stable behavior than bag of words models.
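The abstract does not say how the two representations are combined, so the following is only a minimal sketch of one common option: concatenating a bag-of-words count vector with the average of the word vectors of a document. The embedding values, the vocabulary, and the helper names (`bag_of_words`, `mean_embedding`, `combined_features`) are all hypothetical; real word2vec vectors would come from a trained model and have far more dimensions.

```python
from collections import Counter

# Made-up 3-dimensional vectors standing in for trained word2vec embeddings.
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "movie": [0.0, 0.5, 0.5],
}


def bag_of_words(tokens, vocabulary):
    """Count vector with one index per vocabulary word (mostly zeros in practice)."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocabulary]


def mean_embedding(tokens, embeddings, dim):
    """Compact document vector: the average of the vectors of the known words."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def combined_features(tokens, vocabulary, embeddings, dim):
    """Concatenate both representations into a single feature vector."""
    return bag_of_words(tokens, vocabulary) + mean_embedding(tokens, embeddings, dim)


vocabulary = ["bad", "good", "great", "movie", "plot"]
doc = "good movie great plot".split()
features = combined_features(doc, vocabulary, TOY_EMBEDDINGS, dim=3)
print(len(features))  # 5 bag-of-words counts followed by 3 embedding averages
```

The resulting vector can be passed to any standard classifier. Concatenation is an assumption made here for illustration, not necessarily the scheme used in the paper.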
Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 66, 30 December 2016, Pages 1-6
Authors
Fernando Enríquez, José A. Troyano, Tomás López-Solaz