Article ID Journal Published Year Pages File Type
515744 Information Processing & Management 2008 11 Pages PDF
Abstract

We present two approaches to email thread summarization: collective message summarization (CMS) applies a multi-document summarization approach, while individual message summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron email collection – a very challenging corpus because of the highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre.

Keywords
Related Topics
Physical Sciences and Engineering Computer Science Computer Science Applications
Authors
, , ,