Article code: 489580
Journal code: 704581
Publication year: 2015
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Extraction of Interlingual Documents Clusters Based on Closed Concepts Mining
Persian translation of the title
استخراج خوشه‌های اسناد بین‌زبانی بر اساس کاوش مفاهیم بسته
Related subjects
Engineering and Basic Sciences; Computer Engineering; Computer Science (General)
English abstract

To address multilingual document classification in an efficient and effective manner, we claim that a synergy between classical IR techniques, such as the vector model, and some advanced data mining methods, especially Formal Concept Analysis, is particularly appropriate. In this paper, we propose a new statistical approach for extracting inter-language clusters from multilingual documents based on Closed Concepts Mining and the vector model. Formal Concept Analysis techniques are applied to extract Closed Concepts from comparable corpora; these Closed Concepts and vector models are then exploited in the clustering and alignment of multilingual documents. An experimental evaluation is conducted on the French-English bilingual document collection of CLEF'2003. The results confirm that the synergy between Formal Concept Analysis and the vector model is fruitful for extracting bilingual classes of documents, with an interesting comparability score.
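The abstract does not give the algorithm itself, so the following is only a minimal sketch of the two ingredients it names: closed (formal) concepts from Formal Concept Analysis and a vector-model similarity. It is not the authors' method. The toy corpus `DOCS`, the document ids, and the helper names `extent`, `intent`, `closed_concepts`, and `cosine` are all hypothetical, and the sketch assumes the French and English terms have already been mapped to a shared vocabulary.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy corpus: document id -> set of index terms.
# Assumption: terms from both languages are already mapped to one vocabulary.
DOCS = {
    "en1": {"market", "bank", "euro"},
    "en2": {"market", "euro"},
    "fr1": {"market", "bank", "euro"},
    "fr2": {"bank", "loan"},
}


def extent(terms, docs):
    """Return the documents that contain every term in `terms`."""
    return frozenset(d for d, t in docs.items() if terms <= t)


def intent(doc_ids, docs):
    """Return the terms shared by every document in `doc_ids`."""
    return frozenset(set.intersection(*(set(docs[d]) for d in doc_ids)))


def closed_concepts(docs):
    """Enumerate (extent, intent) pairs by brute force over document subsets.

    Exponential in the number of documents, so only suitable for toy data.
    Concepts with an empty intent are skipped.
    """
    concepts = set()
    ids = sorted(docs)
    for r in range(1, len(ids) + 1):
        for subset in combinations(ids, r):
            shared = intent(subset, docs)
            if shared:
                concepts.add((extent(shared, docs), shared))
    return concepts


def cosine(a, b):
    """Cosine similarity of two documents seen as binary term vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


if __name__ == "__main__":
    for ext, inte in sorted(closed_concepts(DOCS), key=lambda c: sorted(c[0])):
        print(sorted(ext), "share", sorted(inte))
    print("cosine(en1, fr1) =", cosine(DOCS["en1"], DOCS["fr1"]))
```

On this toy data, the concept whose extent is `en1` and `fr1` groups an English and a French document under the same closed term set, which is the kind of bilingual class the abstract refers to; the cosine score then gives one possible way to rank how comparable two documents are.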

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 60, 2015, Pages 537-546