Article ID: 557735
Journal: Computer Speech & Language
Published Year: 2016
Pages: 23
File Type: PDF
Abstract

• We formalize the problem of MDS as the weighted hierarchical archetypal analysis problem.
• The weighted hierarchical archetypal analysis shows the ability to select “the best of the best” summary sentences.
• The paper demonstrates how four different summarization tasks can be treated by the proposed framework.
• Experimental results show that the framework is an effective summarization method.

Multi-document summarization (MDS) is becoming a crucial task in natural language processing. MDS aims to condense the most important information from a set of documents into a brief summary. Most existing extractive MDS methods use some sentence selection approach to build the summary as a subset of the sentences in the given document set. The ability of weighted hierarchical archetypal analysis to select “the best of the best” summary sentences motivates its use in our solution to MDS tasks. In this paper, we propose a new framework for various MDS tasks based on weighted hierarchical archetypal analysis. We show how four summarization tasks (general, query-focused, update, and comparative summarization) can be modeled as different variants of the proposed framework. Experiments on the DUC04-07 and TAC08 summarization data sets demonstrate the efficiency and effectiveness of the framework on all four kinds of task.
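
To make the idea of selecting sentences through archetypes more concrete, here is a minimal sketch of plain archetypal analysis applied to a tiny sentence-term matrix. It is not the weighted hierarchical variant proposed in the paper, whose details the abstract does not give. The matrix `X`, the function names, the projected-gradient solver, and the rule of picking the highest-weighted sentence per archetype are all illustrative assumptions.

```python
import random

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    Qt = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

def transpose(M):
    return [list(col) for col in zip(*M)]

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def residual(X, A, B):
    """Return R = A(BX) - X and its squared Frobenius norm."""
    approx = matmul(A, matmul(B, X))
    R = [[a - x for a, x in zip(ar, xr)] for ar, xr in zip(approx, X)]
    return R, sum(v * v for row in R for v in row)

def archetypal_analysis(X, k, iters=200, seed=0):
    """Approximate X by A(BX), with the rows of A and B on the simplex.

    Assumed solver: projected gradient steps; a step is kept only if
    the reconstruction error does not increase, otherwise it is halved.
    """
    rng = random.Random(seed)
    n = len(X)
    A = [project_simplex([rng.random() for _ in range(k)]) for _ in range(n)]
    B = [project_simplex([rng.random() for _ in range(n)]) for _ in range(k)]
    step = 1.0
    R, err = residual(X, A, B)
    for _ in range(iters):
        Z = matmul(B, X)                      # archetypes, k x d
        grad_A = matmul(R, transpose(Z))      # n x k
        grad_B = matmul(transpose(A), matmul(R, transpose(X)))  # k x n
        new_A = [project_simplex([a - step * g for a, g in zip(ar, gr)])
                 for ar, gr in zip(A, grad_A)]
        new_B = [project_simplex([b - step * g for b, g in zip(br, gr)])
                 for br, gr in zip(B, grad_B)]
        new_R, new_err = residual(X, new_A, new_B)
        if new_err <= err:
            A, B, R, err = new_A, new_B, new_R, new_err
        else:
            step *= 0.5
    return A, B, err

# Made-up term counts: 5 sentences (rows) by 4 terms (columns).
X = [
    [3.0, 0.0, 1.0, 0.0],
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [0.0, 2.0, 1.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
]

A, B, err = archetypal_analysis(X, k=2)

# Assumed selection rule: for each archetype, take the sentence that
# contributes most to it (largest entry in that row of B).
picked = sorted({max(range(len(X)), key=lambda i: row[i]) for row in B})
```

Here each row of `B` expresses one archetype as a convex combination of sentences, so `picked` holds the indices of the sentences that dominate the archetypes. A real system would add the sentence weighting and hierarchical decomposition that the paper describes.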

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing