کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
402179 | 676872 | 2016 | 12 صفحه PDF | دانلود رایگان |
• We extract co-occurrence term relations using IdeaGraph.
• We extract semantic term relations using topic model.
• We fuse multiple types of relations to form a coupled term graph.
• We extract topics from the graph using a graph analytical approach.
Topic detection as a tool to detect topics from online media attracts much attention. Generally, a topic is characterized by a set of informative keywords/terms. Traditional approaches are usually based on various topic models, such as Latent Dirichlet Allocation (LDA). They cluster terms into a topic by mining semantic relations between terms. However, co-occurrence relations across the document are commonly neglected, which leads to the detection of incomplete information. Furthermore, the inability to discover latent co-occurrence relations via the context or other bridge terms prevents the important but rare topics from being detected.To tackle this issue, we propose a hybrid relations analysis approach to integrate semantic relations and co-occurrence relations for topic detection. Specifically, the approach fuses multiple relations into a term graph and detects topics from the graph using a graph analytical method. It can not only detect topics more effectively by combing mutually complementary relations, but also mine important rare topics by leveraging latent co-occurrence relations. Extensive experiments demonstrate the advantage of our approach over several benchmarks.
Journal: Knowledge-Based Systems - Volume 93, 1 February 2016, Pages 109–120