Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4960612 | Procedia Computer Science | 2017 | 10 Pages |
Abstract
Vector representations of words are useful in many natural language processing tasks because they capture the semantic meaning of words. In this context, the three well-known methods are LSA, Word2Vec and GloVe. In this paper, these methods are investigated in the field of topic segmentation for both Arabic and English. Moreover, Word2Vec is studied in depth using different models and approximation algorithms. The results show that the performance of LSA, Word2Vec and GloVe depends on the language used. Overall, Word2Vec provides the best word vector representation, although its performance depends on the choice of model.
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science (General)
Authors
Marwa Naili, Anja Habacha Chaibi, Henda Hajjami Ben Ghezala