Free article download: How far we can go with extractive text summarization? Heuristic methods to obtain near upper bounds

Article code	Journal code	Publication year	English article	Full-text version
4942986	1437616	2017	69-page PDF	Free download

English title of the ISI article

How far we can go with extractive text summarization? Heuristic methods to obtain near upper bounds

Persian translation of the title

چقدر می توانیم خلاصه ای از متن استخراج کنیم؟ روش های اکتشافی برای دستیابی به مرزهای بالایی

Keywords

Extractive text summarization, upper bound construction, ideal extract construction, summary evaluation

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

English abstract

Extractive text summarization automatically reduces a text to a summary by selecting a subset of the text. The performance of a summarization system is usually evaluated by comparison with human-constructed extractive summaries provided in annotated text datasets. However, for datasets in which the abstract is written for readers, the system is evaluated against an abstract that a human wrote in their own words. This makes it difficult to determine how far state-of-the-art extractive methods are from the upper bound that an ideal extractive method might achieve. In addition, the performance of an extractive method differs from domain to domain, which makes benchmarking difficult. Previous studies construct an ideal sentence-based extract of a document, the one that gives the best score on a given metric, by exhaustive search over all possible sentence combinations of a given length. They then use the performance of that extract as the sentence-based upper bound. However, this approach only applies to short texts. For long texts and multiple documents, previous studies rely on manual effort, which is expensive and time consuming. In this paper, we propose nine fast heuristic methods to generate near-ideal sentence-based extracts for long texts and multiple documents. Furthermore, we propose an n-gram construction method to construct the word-based upper bound. A percentage ranking method is used to benchmark different extractive methods across different corpora. Five different corpora are used in the experiments. The results show that the near upper bounds constructed by the proposed methods are close to those obtained by exhaustive search, while the proposed methods are much faster. Six general extractive summarization methods were also assessed to demonstrate the gap between their performance and the near upper bounds.
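
To illustrate the contrast the abstract draws between exhaustive search and a fast heuristic, here is a minimal Python sketch. It is not one of the paper's nine methods, which the abstract does not describe. The scoring function (a clipped unigram recall in the style of ROUGE-1), the greedy selection strategy, and all function names are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def unigram_recall(extract_sentences, reference):
    """Clipped unigram recall of an extract against a reference abstract.

    Assumed stand-in metric (ROUGE-1-style); tokenization is a plain
    lowercase whitespace split.
    """
    ref_counts = Counter(reference.lower().split())
    ext_counts = Counter(" ".join(extract_sentences).lower().split())
    overlap = sum(min(n, ext_counts[w]) for w, n in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0


def exhaustive_extract(sentences, reference, k):
    """Score every k-sentence combination and keep the best one.

    The number of combinations grows combinatorially with the number of
    sentences, which is why this is only practical for short texts.
    """
    k = min(k, len(sentences))
    best = max(
        combinations(range(len(sentences)), k),
        key=lambda idx: unigram_recall([sentences[i] for i in idx], reference),
    )
    return list(best)


def greedy_extract(sentences, reference, k):
    """Hypothetical heuristic: repeatedly add the sentence that raises
    the score the most. Needs roughly k * n evaluations instead of
    one evaluation per combination."""
    chosen = []
    for _ in range(min(k, len(sentences))):
        remaining = [i for i in range(len(sentences)) if i not in chosen]
        best = max(
            remaining,
            key=lambda i: unigram_recall(
                [sentences[j] for j in chosen + [i]], reference
            ),
        )
        chosen.append(best)
    return sorted(chosen)
```

Both functions return the indices of the selected sentences, so the score of the greedy extract can be compared directly with the exhaustive optimum on short inputs. A greedy choice is not guaranteed to reach the optimum, which is why such a result is only a near upper bound.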

Publisher

Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 90, 30 December 2017, Pages 439-463

Authors

W.M. Wang, Z. Li, J.W. Wang, Z.H. Zheng
