کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
6864686 1439549 2018 34 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Latent Dirichlet mixture model
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Latent Dirichlet mixture model
چکیده انگلیسی
Text representation based on latent topic model is seen as a non-Gaussian problem where the observed words and latent topics are multinomial variables and the topic proportionals are Dirichlet variables. Traditional topic model is established by introducing a single Dirichlet prior to characterize the topic proportionals. The words in a text document are represented by a random mixture of semantic topics. However, in real world, a single Dirichlet distribution may not faithfully reflect the variations of topic proportionals estimated from the heterogeneous documents. To address these variations, we propose a new latent variable model where latent topics and their proportionals are learned by incorporating the prior based on Dirichlet mixture model. The resulting latent Dirichlet mixture model (LDMM) is constructed for topic clustering as well as document clustering. Multiple Dirichlets provide a solution to build structural latent variables in learning representation over a variety of topics. This study carries out the inference for LDMM according to the variational Bayes and the collapsed variational Bayes. Such an unsupervised LDMM is further extended to a supervised LDMM for text classification. Experiments on document representation, summarization and classification show the merit of structural prior in LDMM topic models.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neurocomputing - Volume 278, 22 February 2018, Pages 12-22
نویسندگان
, , ,