Article Code | Journal Code | Publication Year | English Paper | Full-Text Version |
---|---|---|---|---|
384945 | 660857 | 2012 | 6-page PDF | Free download |

Automatically mining topics from a text corpus has become an important foundation of many topic analysis tasks, such as opinion recognition and Web content classification. Although a large number of topic models and topic mining methods have been proposed for different purposes and have shown success in topic analysis tasks, many applications still call for more accurate models and mining algorithms. A general criterion based on computing a Zipf fitness quantity is proposed to determine whether a topic description is well-formed or not. Based on this quantity, the popular Dirichlet prior on multinomial parameters is found to be unable to consistently produce well-formed topic descriptions. Hence, topic modeling based on LDA with selective Zipf documents as the training dataset is proposed to improve the quality of the generated topic descriptions. Experiments on two standard text corpora, the AP dataset and Reuters-21578, show that the modeling method based on the selective Zipf distribution achieves better perplexity, i.e. a better ability to predict topics. A further test of topic extraction on a collection of news documents about the recent financial crisis shows that the descriptive keywords in the topics are more meaningful and reasonable than those of the traditional topic mining method.
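The abstract does not give the exact definition of the Zipf fitness quantity, so the sketch below only illustrates the general idea: score each document's word-frequency distribution against Zipf's law (here assumed to be the R² of a log-log linear fit of frequency against rank), keep documents above a threshold, and train LDA on the selected subset so that held-out perplexity can be compared against training on the full corpus. The function names `zipf_fitness` and `train_on_selective_zipf`, the threshold value, and the use of scikit-learn's LDA are illustrative assumptions, not the paper's implementation.

```python
# Sketch of selective-Zipf LDA training (assumed fitness measure, not the paper's).
from collections import Counter

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer


def zipf_fitness(text, min_types=20):
    """R^2 of a linear fit of log(frequency) vs. log(rank); higher = closer to Zipf."""
    freqs = np.array(sorted(Counter(text.lower().split()).values(), reverse=True), dtype=float)
    if len(freqs) < min_types:
        return 0.0  # too few word types to judge the fit reliably
    log_rank = np.log(np.arange(1, len(freqs) + 1))
    log_freq = np.log(freqs)
    slope, intercept = np.polyfit(log_rank, log_freq, 1)
    residuals = log_freq - (slope * log_rank + intercept)
    return 1.0 - residuals.var() / log_freq.var()


def train_on_selective_zipf(docs, threshold=0.9, n_topics=50):
    """Keep only documents whose Zipf fitness exceeds `threshold`, then fit LDA on them."""
    selected = [d for d in docs if zipf_fitness(d) >= threshold]
    X = CountVectorizer(stop_words="english").fit_transform(selected)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(X)
    return lda, X


# Usage (hypothetical corpus): lda, X = train_on_selective_zipf(corpus_docs)
# Lower lda.perplexity(X_heldout) would indicate better topic prediction.
```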
► A criterion is proposed to determine whether a topic is well-formed or not.
► The Dirichlet prior on multinomial parameters is weak in describing a well-formed topic.
► Topic modeling based on LDA with selective Zipf documents is effective.
Journal: Expert Systems with Applications - Volume 39, Issue 7, 1 June 2012, Pages 6541–6546