Article code: 483900
Journal code: 702868
Publication year: 2014
English article: 12 pages, PDF
English title of the ISI article
Minimum redundancy and maximum relevance for single and multi-document Arabic text summarization
Persian translation of the title
حداقل افزونگی و حداکثر ارتباط برای خلاصه‌سازی متون عربی تک‌سندی و چندسندی
Related subjects
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

Automatic text summarization aims to produce summaries of one or more texts using machine techniques. In this paper, we propose a novel statistical summarization system for Arabic texts. First, our system uses a clustering algorithm and an adapted discriminant analysis method, mRMR (minimum redundancy and maximum relevance), to score terms. Through mRMR analysis, terms are ranked according to their discriminant and coverage power. Second, we propose a novel sentence extraction algorithm that selects sentences with top-ranked terms and maximum diversity. Our system uses minimal language-dependent processing: sentence splitting, tokenization, and root extraction. Experimental results on the EASC and TAC 2011 MultiLingual datasets show that the proposed approach is competitive with state-of-the-art systems.
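
The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names: mRMR-style term ranking and diversity-aware sentence extraction. It assumes the generic "relevance minus mean redundancy" mRMR criterion with mutual information, which may differ from the adapted variant used in the paper. The names `mutual_information`, `mrmr_rank`, and `extract` are hypothetical, `labels` stands in for the output of the clustering step, and each sentence is assumed to be already reduced to a set of roots.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_rank(sentences, labels):
    """Greedy ranking (assumed criterion): relevance to the cluster labels
    minus the mean redundancy with the terms already selected."""
    vocab = sorted({t for s in sentences for t in s})
    # Binary occurrence vector of each term over the sentences.
    occ = {t: [int(t in s) for s in sentences] for t in vocab}
    relevance = {t: mutual_information(occ[t], labels) for t in vocab}
    ranked, remaining = [], set(vocab)
    while remaining:
        def score(t):
            if not ranked:
                return relevance[t]
            redundancy = sum(
                mutual_information(occ[t], occ[s]) for s, _ in ranked
            ) / len(ranked)
            return relevance[t] - redundancy

        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(remaining), key=score)
        ranked.append((best, score(best)))
        remaining.remove(best)
    return ranked


def extract(sentences, ranked, k, top_n):
    """Pick up to k sentences, each time preferring the sentence that adds
    the most top-ranked terms not yet covered (a simple diversity heuristic)."""
    top_terms = {t for t, _ in ranked[:top_n]}
    covered, chosen = set(), []
    for _ in range(k):
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        if not candidates:
            break

        def gain(i):
            return len((top_terms & sentences[i]) - covered)

        best = max(candidates, key=gain)
        if gain(best) == 0:
            break  # no remaining sentence adds a new top-ranked term
        chosen.append(best)
        covered |= sentences[best]
    return sorted(chosen)


# Toy input: each sentence is a set of (hypothetical) roots; labels are
# assumed cluster assignments, one per sentence.
sentences = [
    {"economy", "market", "growth"},
    {"economy", "market", "bank"},
    {"football", "match", "goal"},
    {"football", "match", "team"},
]
labels = [0, 0, 1, 1]
ranked = mrmr_rank(sentences, labels)
summary = extract(sentences, ranked, k=2, top_n=len(ranked))
```

In this toy input the two selected sentences come from different clusters, because the second pick is scored only on terms the first pick has not already covered.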

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of King Saud University - Computer and Information Sciences - Volume 26, Issue 4, December 2014, Pages 450–461