Article ID Journal Published Year Pages File Type
391899 Information Sciences 2013 14 Pages PDF
Abstract

Graph-based summarization entails extracting a worthwhile subset of sentences from a collection of textual documents by using a graph-based model to represent the correlations between pairs of document terms. However, since the high-order correlations among multiple terms are disregarded during graph evaluation, the summarization performance could be limited unless integrating ad hoc language-dependent or semantics-based analysis.This paper presents a novel and general-purpose graph-based summarizer, namely GraphSum (Graph-based Summarizer). It discovers and exploits association rules to represent the correlations among multiple terms that have been neglected by previous approaches. The graph nodes, which represent combinations of two or more terms, are first ranked by means of a PageRank strategy that discriminates between positive and negative term correlations. Then, the produced node ranking is used to drive the sentence selection process.The experiments performed on benchmark and real-life documents demonstrate the effectiveness of the proposed approach compared to many state-of-the-art summarizers.

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, , , ,