Article code: 4947783
Journal code: 1439590
Publication year: 2017
English article: 26 pages, PDF
Full-text version: free download
English title of the ISI article
PSDVec: A toolbox for incremental and scalable word embedding
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
PSDVec is a Python/Perl toolbox that learns word embeddings, i.e., mappings from the words of a natural language to continuous vectors that encode the semantic and syntactic regularities between words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite approximation. To scale up the learning process, we implement a blockwise online learning algorithm that learns the embeddings incrementally. This strategy greatly reduces the learning time of word embeddings on a large vocabulary, and can learn the embeddings of new words without re-learning the whole vocabulary. On 9 word similarity/analogy benchmark sets and 2 Natural Language Processing (NLP) tasks, PSDVec produces embeddings that have the best average performance among popular word embedding tools. PSDVec provides a new option for NLP practitioners.
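To make the phrase "low-rank positive semidefinite approximation" concrete, the following is a minimal sketch of the unweighted case only, written with NumPy. It is not PSDVec's code or API: the function name `low_rank_psd_embeddings`, the toy matrix `G`, and the parameter `dim` are hypothetical, and the weighting and the blockwise online learning described in the abstract are deliberately left out. The sketch assumes a symmetric word-association matrix is already available and factors it as V Vᵀ, so that each row of V can be read as a word vector.

```python
import numpy as np


def low_rank_psd_embeddings(G, dim):
    """Return V (n x dim) such that V @ V.T is the closest PSD matrix of
    rank <= dim to the symmetric matrix G in Frobenius norm.

    Hypothetical helper for illustration; unweighted case only.
    """
    G = (G + G.T) / 2.0                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(G)     # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]    # indices of the largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)   # drop negative eigenvalues (PSD constraint)
    return eigvecs[:, top] * np.sqrt(lam)    # scale eigenvectors column by column


# Toy symmetric association matrix for four hypothetical words (made-up numbers).
G = np.array([
    [2.0, 1.5, 0.1, 0.0],
    [1.5, 2.0, 0.2, 0.1],
    [0.1, 0.2, 1.8, 1.2],
    [0.0, 0.1, 1.2, 1.6],
])

V = low_rank_psd_embeddings(G, dim=2)        # one 2-dimensional vector per word
approx = V @ V.T                             # rank-2 PSD approximation of G
```

Each row of `V` plays the role of an embedding, and the inner product of two rows approximates the corresponding entry of `G`. In the weighted setting the abstract refers to, a single eigendecomposition is in general not enough, which is why this sketch should be read only as an illustration of the unweighted idea.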
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 237, 10 May 2017, Pages 405-409