کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
10355256 | 867123 | 2005 | 21 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Text summarization using a trainable summarizer and latent semantic analysis
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله

چکیده انگلیسی
This paper proposes two approaches to address text summarization: modified corpus-based approach (MCBA) and LSA-based T.R.M. approach (LSAÂ +Â T.R.M.). The first is a trainable summarizer, which takes into account several features, including position, positive keyword, negative keyword, centrality, and the resemblance to the title, to generate summaries. Two new ideas are exploited: (1) sentence positions are ranked to emphasize the significances of different sentence positions, and (2) the score function is trained by the genetic algorithm (GA) to obtain a suitable combination of feature weights. The second uses latent semantic analysis (LSA) to derive the semantic matrix of a document or a corpus and uses semantic sentence representation to construct a semantic text relationship map. We evaluate LSAÂ +Â T.R.M. both with single documents and at the corpus level to investigate the competence of LSA in text summarization. The two novel approaches were measured at several compression rates on a data corpus composed of 100 political articles. When the compression rate was 30%, an average f-measure of 49% for MCBA, 52% for MCBAÂ +Â GA, 44% and 40% for LSAÂ +Â T.R.M. in single-document and corpus level were achieved respectively.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Processing & Management - Volume 41, Issue 1, January 2005, Pages 75-95
Journal: Information Processing & Management - Volume 41, Issue 1, January 2005, Pages 75-95
نویسندگان
Jen-Yuan Yeh, Hao-Ren Ke, Wei-Pang Yang, I-Heng Meng,