Article ID: 725093
Journal: The Journal of China Universities of Posts and Telecommunications
Published Year: 2012
Pages: 8 Pages
File Type: PDF
Abstract

With the development of Web 2.0, users are becoming more and more deeply involved in the Internet, not only as readers but also as authors. Wording preference is a well-known phenomenon in which different people tend to use different words even when they talk about the same topic. We argue that this phenomenon has a substantial impact on modeling texts written by different authors, especially on topic modeling. This paper proposes a way to model users' wording preferences with a Dirichlet process (DP) within a topic model framework. Experiments on a corpus of social tagging data from del.icio.us show that our model outperforms the hierarchical Dirichlet process mixture model (DPMM). Incorporating user preference not only improves performance on the standard topic modeling task but also reveals each user's preferences.

Related Topics
Physical Sciences and Engineering > Engineering > Electrical and Electronic Engineering