Genetic algorithm for text clustering using ontology and evaluating the validity of various semantic similarity measures

Article code	Journal code	Publication year	English article	Full-text version
386960	660893	2009	10-page PDF	Free download


Keywords

WordNet; Genetic algorithm; Text clustering; Latent semantic indexing; Ontology

Related topics

Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence


Abstract

This paper proposes a self-organized genetic algorithm for text clustering based on an ontology method. A common problem in text clustering is that a document is represented as a bag of words, so conceptual similarity is ignored. We take advantage of thesaurus-based and corpus-based ontologies to overcome this problem. However, the traditional corpus-based method is rather difficult to handle. A transformed latent semantic indexing (LSI) model, which can appropriately capture the associated semantic similarity, is proposed and demonstrated as a corpus-based ontology in this article. To investigate how ontology methods can be used effectively in text clustering, two hybrid strategies using various similarity measures are implemented. Experimental results show that our genetic algorithm in conjunction with the ontology strategy (the combination of the transformed LSI-based measure with the thesaurus-based measure) clearly outperforms the same algorithm with traditional similarity measures. Our clustering algorithm also improves performance in comparison with the standard GA and k-means in the same similarity environments.
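
The abstract does not spell out how the two similarity measures are combined or which genetic operators are used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the two measures are already available as document-by-document similarity matrices and blends them with a hypothetical weight `alpha`. The fitness function (within-cluster similarity relative to the average pairwise similarity), the tournament selection, the one-point crossover and the mutation rate are all illustrative choices, and the names `hybrid_similarity`, `fitness`, `ga_cluster` and `block_matrix` are invented for this example.

```python
import random


def hybrid_similarity(sim_a, sim_b, alpha=0.5):
    """Blend two similarity matrices; alpha is a hypothetical mixing weight."""
    n = len(sim_a)
    return [[alpha * sim_a[i][j] + (1.0 - alpha) * sim_b[i][j]
             for j in range(n)] for i in range(n)]


def fitness(assignment, sim):
    """Sum, over same-cluster pairs, of similarity above the average pair similarity."""
    n = len(assignment)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    baseline = sum(sim[i][j] for i, j in pairs) / len(pairs)
    return sum(sim[i][j] - baseline
               for i, j in pairs if assignment[i] == assignment[j])


def ga_cluster(sim, k, pop_size=30, generations=100, mutation_rate=0.1, seed=0):
    """Plain GA over cluster-label vectors (assumed operators, not the paper's)."""
    rng = random.Random(seed)
    n = len(sim)

    def score(a):
        return fitness(a, sim)

    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the best individual over unchanged.
        next_gen = [max(population, key=score)[:]]
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            # One-point crossover followed by per-gene mutation.
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


def block_matrix(within, between):
    """Toy 6x6 similarity matrix: documents 0-2 and 3-5 form two topics."""
    groups = [0, 0, 0, 1, 1, 1]
    return [[1.0 if i == j else (within if groups[i] == groups[j] else between)
             for j in range(6)] for i in range(6)]


# Made-up stand-ins for a thesaurus-based and an LSI-based measure.
thesaurus_sim = block_matrix(0.8, 0.2)
lsi_sim = block_matrix(0.9, 0.1)
sim = hybrid_similarity(thesaurus_sim, lsi_sim, alpha=0.5)
labels = ga_cluster(sim, k=2)
print(labels)
```

On this toy input the search space is tiny, so the sketch only illustrates the mechanics: documents that are similar under the blended measure end up sharing a label. Real similarity matrices would come from a thesaurus such as WordNet and from an LSI model, neither of which is reproduced here.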

Publisher

Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 36, Issue 5, July 2009, Pages 9095–9104

Authors

Wei Song, Cheng Hua Li, Soon Cheol Park
