Article code: 4943337
Journal code: 1437620
Publication year: 2017
English article: 50-page PDF
Full-text version: Free download
English title of the ISI article
Extractive multi-document summarization using population-based multicriteria optimization
Persian translation of the title
خلاصه سازی استخراج چند سند با استفاده از بهینه سازی چند متغیرهای مبتنی بر جمعیت
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
Multi-document summarization is the process of extracting salient information from a set of source texts and presenting that information to the user in a condensed form. In this paper, we propose a multi-document summarization system that generates an extractive generic summary with maximum relevance and minimum redundancy by representing each sentence of the input documents as a vector of words drawn from the Proper Noun, Noun, Verb and Adjective sets. Five features associated with the sentences (TF_ISF, Aggregate Cross Sentence Similarity, Title Similarity, Proper Noun and Sentence Length) are extracted, and scores are assigned to sentences based on these features. The weights assigned to the features may vary with the nature of the document, and it is hard to discover the most appropriate weight for each feature, which makes generating a good summary very difficult without human intelligence. The multi-document summarization problem has a large number of decision parameters and a large number of possible solutions from which the best summary must be selected. A generated summary may not guarantee the essential quality and may be far from the ideal human-generated summary. To address this issue, we propose a population-based multicriteria optimization method with multiple objective functions. Three objective functions are selected to determine an optimal summary, with maximum relevance, diversity, and novelty, from a global population of summaries by considering both the statistical and semantic aspects of the documents. Semantic aspects are captured with Latent Semantic Analysis (LSA) and Non-Negative Matrix Factorization (NMF). Experiments were performed on the DUC 2002, DUC 2004 and DUC 2006 datasets using the ROUGE toolkit. Experimental results show that our system outperforms state-of-the-art systems in terms of Recall and Precision.
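The feature-based sentence scoring mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the TF-ISF formula used here (term frequency times the log of inverse sentence frequency, averaged over the sentence's terms), the linear weighted sum, and the function names `tf_isf_scores` and `weighted_score` are all assumptions made for illustration, since the abstract does not give the exact definitions.

```python
import math
from collections import Counter


def tf_isf_scores(sentences):
    """Return one TF-ISF score per tokenized sentence.

    Assumed definition: mean over the sentence's tokens of
    tf(term, sentence) * log(N / sf(term)), where sf is the number of
    sentences containing the term and N is the number of sentences.
    """
    n = len(sentences)
    # Sentence frequency: count each term once per sentence.
    sf = Counter(term for sent in sentences for term in set(sent))
    scores = []
    for sent in sentences:
        if not sent:
            scores.append(0.0)
            continue
        tf = Counter(sent)
        total = sum(count * math.log(n / sf[term]) for term, count in tf.items())
        scores.append(total / len(sent))
    return scores


def weighted_score(features, weights):
    """Combine feature values for one sentence as a linear weighted sum (assumed form)."""
    return sum(w * f for w, f in zip(weights, features))


# Toy input: three already-tokenized sentences (hypothetical data).
sentences = [
    ["summarization", "extracts", "salient", "information"],
    ["summarization", "condenses", "information"],
    ["weights", "vary", "across", "documents"],
]
tf_isf = tf_isf_scores(sentences)
```

In this sketch the weight vector passed to `weighted_score` is exactly the quantity the abstract describes as hard to choose by hand, which is the motivation it gives for searching over a population of candidate summaries instead.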
Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 86, 15 November 2017, Pages 385-397