Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
10321910 | Expert Systems with Applications | 2015 | 38 Pages |
Abstract
Information overload becomes a serious problem in the digital age. It negatively impacts understanding of useful information. How to alleviate this problem is the main concern of research on natural language processing, especially multi-document summarization. With the aim of seeking a new method to help justify the importance of similar sentences in multi-document summarizations, this study proposes a novel approach based on recent hierarchical Bayesian topic models. The proposed model incorporates the concepts of n-grams into hierarchically latent topics to capture the word dependencies that appear in the local context of a word. The quantitative and qualitative evaluation results show that this model has outperformed both hLDA and LDA in document modeling. In addition, the experimental results in practice demonstrate that our summarization system implementing this model can significantly improve the performance and make it comparable to the state-of-the-art summarization systems.
Keywords
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Guangbing Yang, Dunwei Wen, Kinshuk Kinshuk, Nian-Shing Chen, Erkki Sutinen,