Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
514951 | 866917 | 2016 | 12-page PDF | Free download |
• We propose a novel hybrid method to capture group relations among sentences.
• We cluster sentences using a KL divergence computed over word-topic distributions.
• We propose a vertex-reinforced random walk process on a hypergraph model.
• The process simultaneously considers the query similarity, centrality, and diversity of sentences.
• We implement our framework and verify its improvement over appropriate baselines.
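The KL-divergence clustering step in the highlights can be illustrated with a minimal sketch. It assumes each sentence is already represented by a discrete topic distribution (for example, one derived from a topic model). The symmetrized divergence, the greedy single-pass grouping, and the names `symmetric_kl`, `cluster_sentences`, and `threshold` are illustrative assumptions, not the clustering procedure used in the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps is a small smoothing constant (an assumption of this sketch)
    that avoids division by zero when q has zero entries.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p), so the measure is symmetric."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def cluster_sentences(topic_dists, threshold):
    """Greedy single-pass grouping (a stand-in, not the paper's method).

    A sentence joins the first cluster whose seed sentence lies within
    `threshold` in symmetric KL divergence; otherwise it seeds a new cluster.
    Returns a list of clusters, each a list of sentence indices.
    """
    clusters = []
    for i, dist in enumerate(topic_dists):
        for cluster in clusters:
            seed = topic_dists[cluster[0]]
            if symmetric_kl(dist, seed) <= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical topic distributions for three sentences over three topics.
topic_dists = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
]
clusters = cluster_sentences(topic_dists, threshold=0.1)
```

In a hypergraph model, each resulting cluster could then serve as one hyperedge connecting all of its member sentences.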
Random walks on general graphs have been successfully applied to multi-document summarization, but processing documents this way has some limitations. In this paper, we propose a novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization. The framework first uses the Hierarchical Dirichlet Process (HDP) topic model to learn a word-topic probability distribution in sentences. A hypergraph then captures both the cluster relationships derived from the word-topic probability distribution and the pairwise similarity among sentences. Finally, a time-variant random walk algorithm for hypergraphs ranks the sentences, using vertex reinforcement to ensure sentence diversity in the summaries. Experimental results on a publicly available dataset demonstrate the effectiveness of our framework.
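The ranking stage can be sketched as follows, under stated assumptions. The transition matrix uses the standard hypergraph random walk (pick a hyperedge containing the current vertex in proportion to its weight, then pick a vertex in that hyperedge uniformly). The reinforcement step scales each transition by the target vertex's current score, in the style of DivRank. The abstract does not give the exact update rule, so the functions `hypergraph_transition` and `reinforced_walk`, the `damping` value, and the uniform `prior` (standing in for query similarity) are all hypothetical.

```python
def hypergraph_transition(n_vertices, hyperedges, weights):
    """Row-stochastic matrix P[u][v] = sum_e w(e) / (d(u) * |e|) over
    hyperedges e containing both u and v, where d(u) is the total weight
    of hyperedges containing u. Assumes each hyperedge lists distinct
    vertices and every vertex belongs to at least one hyperedge.
    """
    P = [[0.0] * n_vertices for _ in range(n_vertices)]
    degree = [0.0] * n_vertices
    for edge, w in zip(hyperedges, weights):
        for u in edge:
            degree[u] += w
    for edge, w in zip(hyperedges, weights):
        for u in edge:
            for v in edge:
                P[u][v] += w / (degree[u] * len(edge))
    return P


def reinforced_walk(P, prior, damping=0.85, iterations=50):
    """Time-variant walk: at each step, transitions into a vertex are
    scaled by that vertex's current score, so high-scoring vertices
    absorb weight from their neighbors (a DivRank-style assumption).
    """
    n = len(P)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [0.0] * n
        for u in range(n):
            # Normalizer for the reinforced outgoing probabilities of u.
            norm = sum(P[u][v] * rank[v] for v in range(n))
            for v in range(n):
                step = (1.0 - damping) * prior[v]
                if norm > 0:
                    step += damping * P[u][v] * rank[v] / norm
                new_rank[v] += rank[u] * step
        total = sum(new_rank)
        rank = [r / total for r in new_rank]
    return rank


# Hypothetical hypergraph: four sentences, two hyperedges of equal weight.
hyperedges = [[0, 1, 2], [2, 3]]
weights = [1.0, 1.0]
P = hypergraph_transition(4, hyperedges, weights)
prior = [0.25, 0.25, 0.25, 0.25]  # uniform stand-in for query similarity
scores = reinforced_walk(P, prior)
```

In this sketch the `prior` term carries query similarity, the walk itself reflects centrality, and the reinforcement term is what pushes the ranking toward diverse sentences.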
Journal: Information Processing & Management - Volume 52, Issue 4, July 2016, Pages 670–681