Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
10322315 | 660859 | 2012 | 10-page PDF | Free download |
English title of the ISI article
Is the contextual information relevant in text clustering by compression?
Related topics
Engineering and basic sciences
Computer engineering
Artificial intelligence
English abstract
Usually, when analyzing data that have not yet been processed or filtered, one observes that not all the data are equally important. Thus, it is common to find relevant data surrounded by non-relevant data. This occurs when analyzing textual information because of its intrinsic nature: texts contain words that provide a lot of information about the subject matter, alongside other words with little meaning or relevance. We believe that although in principle the non-relevant words are not as important as the relevant ones, the former constitute the substrate that supports the latter. Since this substrate is the context that surrounds the relevant information, we call it the contextual information. In this paper, we analyze the relevance that the contextual information has in textual data, in a clustering by compression scenario. We generate the contextual information by applying a distortion technique previously developed by the authors. One of the main characteristics of this technique is that it maintains the contextual information. In this paper we compare this technique with three new distortion techniques that destroy the contextual information in different ways. The experimental results support our hypothesis that the contextual information is relevant, at least in the area of text clustering by compression.
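For readers unfamiliar with clustering by compression, the approach is commonly built on the normalized compression distance (NCD), which treats two texts as similar when compressing them together costs little more than compressing one alone. The sketch below is a minimal illustration only: the abstract does not state which compressor or distance the authors use, so the choice of `zlib` and the helper names `compressed_size` and `ncd` are assumptions made for this example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size. Lower values suggest more shared structure.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

A pairwise matrix of such distances over a document collection could then be passed to any standard hierarchical clustering routine; how the distortion techniques mentioned in the abstract alter the texts before this step is not described here.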
Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 39, Issue 10, August 2012, Pages 8537-8546
Authors
Ana Granados, David Camacho, Francisco Borja Rodríguez