Article code: 382900
Journal code: 660796
Publication year: 2014
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Text classification using genetic algorithm oriented latent semantic features
Persian translation of the title
طبقه بندی متن با استفاده از الگوریتم ژنتیک گرا ویژگی های معنایی نهفته
Keywords
Feature selection, genetic algorithm, latent semantic indexing, text classification
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• Genetic algorithm oriented latent semantic features are proposed.
• The proposed approach consists of feature selection and transformation stages.
• Genetic algorithms are employed in the selection of appropriate singular vectors.
• Singular vectors are not limited to the ones with the largest singular values.
• The proposed approach outperforms standard LSI and feature selection methods.

In this paper, genetic algorithm oriented latent semantic features (GALSF) are proposed to obtain a better representation of documents in text classification. The proposed approach consists of a feature selection stage and a feature transformation stage. The first stage is carried out using state-of-the-art filter-based methods. The second stage employs latent semantic indexing (LSI) empowered by a genetic algorithm, so that a better projection is attained using appropriate singular vectors. Unlike the standard LSI approach, these are not limited to the vectors corresponding to the largest singular values: singular vectors with small singular values may be used for projection, and vectors with large singular values may be eliminated, to obtain better discrimination. Experimental results demonstrate that GALSF outperforms both LSI and filter-based feature selection methods on benchmark datasets for various feature dimensions.
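
The second stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the filter-based first stage has already reduced the term set, so `X_train` and `X_val` are document-by-term matrices over the selected terms. The function names, the fixed-size chromosome (a set of `k` singular-vector indices), tournament selection with elitism, the nearest-centroid accuracy used as fitness, and all parameter values are assumptions made here for illustration; the abstract does not specify any of them.

```python
import numpy as np

def centroid_accuracy(Z_train, y_train, Z_val, y_val):
    """Fitness proxy (an assumption): nearest-centroid accuracy on held-out documents."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((Z_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())

def ga_select_singular_vectors(X_train, y_train, X_val, y_val, k=3,
                               pop_size=20, generations=30,
                               mutation_rate=0.3, seed=0):
    """Search for k singular vectors, at any rank position, that maximise the fitness."""
    rng = np.random.default_rng(seed)
    # Rows of X are documents and columns are terms, so the left singular
    # vectors of X.T live in term space. NumPy returns them ordered by
    # decreasing singular value.
    U = np.linalg.svd(X_train.T, full_matrices=False)[0]
    r = U.shape[1]
    if not 0 < k < r:
        raise ValueError("k must satisfy 0 < k < number of singular vectors")

    def fitness(idx):
        # Project both splits onto the chosen singular vectors, then classify.
        return centroid_accuracy(X_train @ U[:, idx], y_train,
                                 X_val @ U[:, idx], y_val)

    def evaluate(population):
        return np.array([fitness(ind) for ind in population])

    def tournament(population, scores):
        i, j = rng.choice(len(population), size=2, replace=False)
        return population[i] if scores[i] >= scores[j] else population[j]

    # Each chromosome is a set of k distinct singular-vector indices.
    pop = [rng.choice(r, size=k, replace=False) for _ in range(pop_size)]
    pop[0] = np.arange(k)  # seed with the standard LSI choice (top-k vectors)
    scores = evaluate(pop)
    for _ in range(generations):
        new_pop = [pop[int(scores.argmax())].copy()]  # elitism: keep the best
        while len(new_pop) < pop_size:
            a, b = tournament(pop, scores), tournament(pop, scores)
            # Crossover: draw k distinct indices from the parents' union.
            child = rng.choice(np.union1d(a, b), size=k, replace=False)
            if rng.random() < mutation_rate:
                # Mutation: swap one index for a currently unused one.
                unused = np.setdiff1d(np.arange(r), child)
                child[rng.integers(k)] = rng.choice(unused)
            new_pop.append(child)
        pop = new_pop
        scores = evaluate(pop)
    best = int(scores.argmax())
    return np.sort(pop[best]), float(scores[best])
```

Because the population is seeded with the top-`k` indices and the best individual is always carried over, the returned validation fitness in this sketch is never lower than that of the standard LSI choice. Comparing the returned indices with `np.arange(k)` shows whether the search moved away from the largest singular values. A fitness measured on the validation split can overfit to it, so a separate test split would be needed for any honest comparison.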

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 41, Issue 13, 1 October 2014, Pages 5938–5947