Article code: 411614
Journal code: 679578
Publication year: 2016
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification
Keywords
Short text, classification, clustering, convolutional neural network, semantic units, word embedding
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Text classification can help users effectively handle and exploit useful information hidden in large-scale documents. However, data sparsity and semantic sensitivity to context often hinder the classification performance of short texts. To overcome these weaknesses, we propose a unified framework that expands short texts based on word embedding clustering and a convolutional neural network (CNN). Empirically, semantically related words are usually close to each other in embedding spaces. Thus, we first discover semantic cliques via fast clustering. Then, using additive composition over word embeddings from context with variable window width, we compute the representations of multi-scale semantic units in short texts. In embedding spaces, the restricted nearest word embeddings (NWEs) of the semantic units are chosen to constitute expanded matrices, where the semantic cliques are used as supervision information. Finally, for a short text, the projected matrix and expanded matrices are combined and fed into the CNN in parallel. Experimental results on two open benchmarks validate the effectiveness of the proposed method.
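
The expansion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy 2-d vectors, the function names (`semantic_units`, `nearest_word`, `expand`), the use of cosine similarity, and the `min_similarity` threshold are all assumptions made for illustration. The sketch covers only additive composition over windows of variable width and a thresholded nearest-word lookup; the clustering into semantic cliques and the CNN itself are left out.

```python
import math

# Toy 2-d embeddings; the values are made up for illustration only.
EMBEDDINGS = {
    "apple":  [0.9, 0.1],
    "fruit":  [0.8, 0.2],
    "banana": [0.85, 0.15],
    "stock":  [0.1, 0.9],
    "market": [0.2, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_units(tokens, width):
    """Additive composition: sum the embeddings in each window of `width` tokens."""
    vectors = [EMBEDDINGS[t] for t in tokens]
    units = []
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        units.append([sum(dim) for dim in zip(*window)])
    return units


def nearest_word(unit, min_similarity=0.95):
    """Return the vocabulary word closest to `unit`, or None if below the threshold.

    The threshold is an assumed stand-in for the paper's "restricted" lookup.
    """
    best_word, best_sim = None, -1.0
    for word, vec in EMBEDDINGS.items():
        sim = cosine(unit, vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word if best_sim >= min_similarity else None


def expand(tokens, widths=(1, 2)):
    """For each window width, map every semantic unit to its restricted nearest word."""
    return {w: [nearest_word(u) for u in semantic_units(tokens, w)] for w in widths}
```

With these toy vectors, `expand(["apple", "stock"])` maps each single-word unit back to its own word, while the two-word unit gets `None` because no vocabulary word is similar enough to the summed vector. In a fuller version, the vectors of the selected words would be stacked into one expanded matrix per window width.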

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 174, Part B, 22 January 2016, Pages 806–814