Article code | Journal code | Publication year | English article | Full text |
---|---|---|---|---|
403430 | 677227 | 2016 | 15-page PDF | Free download |
Explicit Semantic Analysis (ESA) is a knowledge-based method that builds semantic representations of words from the textual descriptions of concepts in a given knowledge source. Owing to its simplicity and success, ESA has received wide attention from researchers in computational linguistics and information retrieval. However, the representation vectors produced by ESA are generally very high dimensional and may contain many redundant concepts. In this paper, we introduce a reduced semantic representation method that constructs the semantic interpretation of words as vectors over latent topics derived from the original ESA representation vectors. To model the latent topics, Latent Dirichlet Allocation (LDA) is adapted to the ESA vectors so that topics are extracted as probability distributions over concepts, rather than over words as in the traditional model. The proposed method is applied to two knowledge sources widely used in computational semantic analysis: WordNet and Wikipedia. For evaluation, we use the proposed method in two natural language processing tasks: measuring semantic relatedness between words/texts and text clustering. The experimental results indicate that the proposed method overcomes the limitations of the ESA representation.
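The core idea of the abstract can be sketched in a few lines: start from an ESA-style word-by-concept association matrix, fit LDA over it so that topics become distributions over concepts, and compare words by their reduced topic vectors. The following is a minimal illustrative sketch using scikit-learn, not the authors' implementation; the toy matrix, its dimensions, and the number of topics are all hypothetical assumptions.

```python
# Illustrative sketch (not the paper's code): reduce ESA-style concept
# vectors with LDA, then measure relatedness in the latent topic space.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

# Toy ESA matrix: rows = words, columns = knowledge-source concepts
# (in the paper these would come from WordNet or Wikipedia articles).
rng = np.random.default_rng(0)
esa = rng.integers(0, 5, size=(6, 20)).astype(float)  # nonnegative association weights

# LDA adapted to ESA vectors: each "document" is a word's concept vector,
# so the learned topics are distributions over concepts, not over words.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_vectors = lda.fit_transform(esa)  # shape (6, 3): reduced representation

# Semantic relatedness between word 0 and word 1 in the topic space.
sim = cosine_similarity(topic_vectors[:1], topic_vectors[1:2])[0, 0]
print(topic_vectors.shape, sim)
```

The reduction replaces a very high-dimensional, possibly redundant concept vector with a compact distribution over a handful of latent topics, which is the representation then used for relatedness and clustering in the paper's experiments.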
Journal: Knowledge-Based Systems - Volume 100, 15 May 2016, Pages 145–159