Article ID: 484927
Journal: Procedia Computer Science
Published Year: 2015
Pages: 8
File Type: PDF
Abstract

The information contained in a document can be retrieved from its most significant paragraph rather than by reading the whole document. The proposed work ranks the paragraphs of a text document using eigen analysis and returns the most important one. The importance of each paragraph is determined from its correlation with the other paragraphs. The method uses fuzzy graphs to capture this inter-paragraph correlation: the document is modeled as a fuzzy graph in which a node represents a paragraph and an edge represents the relationship between two paragraphs. The correlation between paragraphs is measured by their semantic similarity, and the importance of each node is derived from this correlation. The system then ranks the paragraphs according to their importance. The proposed system is evaluated on the DUC 2001 data set. The ROUGE scores show that the significant paragraph suggested by the proposed method covers a relatively good amount of the relevant information in the document.
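
The pipeline described in the abstract (paragraphs as nodes, weighted edges from pairwise similarity, importance from eigen analysis) can be illustrated with a minimal sketch. The abstract does not give the similarity measure or the exact eigen analysis, so everything below is an assumption: Jaccard word overlap stands in for semantic similarity, eigenvector centrality computed by power iteration stands in for the paper's eigen analysis, and the function names `tokenize`, `similarity_matrix`, `principal_eigenvector`, and `rank_paragraphs` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase word set of a paragraph (assumed stand-in for semantic features)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity_matrix(paragraphs):
    """Edge weights in [0, 1], read as fuzzy membership degrees of the edges.

    Assumption: Jaccard overlap of word sets replaces the paper's
    semantic similarity, which the abstract does not specify.
    """
    tokens = [tokenize(p) for p in paragraphs]
    n = len(tokens)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            union = tokens[i] | tokens[j]
            w = len(tokens[i] & tokens[j]) / len(union) if union else 0.0
            weights[i][j] = weights[j][i] = w
    return weights


def principal_eigenvector(matrix, iterations=200, tol=1e-9):
    """Power iteration on (matrix + I).

    Adding the identity keeps the eigenvectors unchanged but shifts every
    eigenvalue by 1, which avoids oscillation on bipartite-like graphs.
    """
    n = len(matrix)
    vec = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [vec[i] + sum(matrix[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]  # keep the entries summing to 1
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec


def rank_paragraphs(paragraphs):
    """Return paragraph indices ordered from most to least important."""
    if not paragraphs:
        return []
    scores = principal_eigenvector(similarity_matrix(paragraphs))
    return sorted(range(len(paragraphs)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    docs = [
        "fuzzy graph models paragraphs",
        "fuzzy graph ranking uses eigen analysis",
        "eigen analysis ranking",
        "unrelated weather report",
    ]
    order = rank_paragraphs(docs)
    print("most significant paragraph:", docs[order[0]])
```

In this sketch a paragraph scores highly when it is strongly connected to other well-connected paragraphs, which is one common reading of "importance based on correlation"; the published method may weight or normalize the graph differently.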
