Article code: 388649
Journal code: 660935
Publication year: 2010
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Content-based hierarchical document organization using multi-layer hybrid network and tree-structured features
Related topics
Engineering and basic sciences, Computer engineering, Artificial intelligence
Abstract

Automatically organizing documents into a hierarchical tree is in demand in many real applications. In this work, we focus on content-based document organization through a hierarchical tree, which can be viewed as a classification problem. We propose a new document representation to improve classification accuracy, and we develop a new hybrid neural network model to handle it. In our representation, a document is encoded as a tree structure, which has a superior capability for encoding document characteristics. Traditional feature representations encode only the global characteristics of a document, whereas the proposed approach encodes both global and local characteristics through a hierarchical tree. Unlike traditional representations, the tree reflects the spatial organization of words across the pages and paragraphs of a document, which helps to encode the document's semantics better. Processing hierarchical trees is itself a challenging task in terms of computational complexity. The hybrid neural network model we develop for this task is composed of a SOM (self-organizing map) and an MLP (multi-layer perceptron). Experimental results corroborate that our approach is efficient and effective at registering documents into an organized tree compared with other approaches.
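
The tree-structured representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the `build_tree` and `word_counts` helpers, the use of raw word counts over a fixed vocabulary as features, and the conventions that pages are separated by a form feed and paragraphs by a blank line are all assumptions made for illustration. The root node carries global features of the whole document, while page and paragraph nodes carry local ones.

```python
from collections import Counter
from dataclasses import dataclass, field
import re


@dataclass
class Node:
    """One node of the (hypothetical) document tree."""
    level: str                       # "document", "page", or "paragraph"
    features: list                   # word counts over a fixed vocabulary
    children: list = field(default_factory=list)


def word_counts(text, vocabulary):
    """Count how often each vocabulary word occurs in the text (assumed feature)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts[word] for word in vocabulary]


def build_tree(document, vocabulary):
    """Build a document -> pages -> paragraphs tree.

    Assumption: pages are separated by a form feed and
    paragraphs by a blank line.
    """
    root = Node("document", word_counts(document, vocabulary))  # global features
    for page in document.split("\f"):
        if not page.strip():
            continue
        page_node = Node("page", word_counts(page, vocabulary))  # local features
        for paragraph in re.split(r"\n\s*\n", page):
            if paragraph.strip():
                page_node.children.append(
                    Node("paragraph", word_counts(paragraph, vocabulary))
                )
        root.children.append(page_node)
    return root


# Toy two-page document and vocabulary, both made up for this example.
doc = "neural network model\n\nsom and mlp\fdocument tree\n\ntree features"
vocab = ["neural", "tree", "som"]
tree = build_tree(doc, vocab)
```

In this toy example the root sees the word "tree" twice, but only the page-level nodes reveal that both occurrences sit on the second page. This is the kind of local, spatial information that a single global feature vector cannot express.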

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 37, Issue 4, April 2010, Pages 2874–2881