Article code | Journal code | Publication year | English article | Full-text version
388584 | 660930 | 2007 | 5-page PDF | Free download
English title of the ISI article
A novel feature selection algorithm for text categorization
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

With the development of the web, large numbers of documents are now available on the Internet, and the volume of material in digital libraries, news sources and internal company data keeps growing. Automatic text categorization is therefore becoming increasingly important for handling massive data. The major problem in text categorization, however, is the high dimensionality of the feature space. Many methods for text feature selection already exist. To improve the performance of text categorization, we present another one. Our study is based on Gini index theory, and we design a novel Gini index algorithm to reduce the high dimensionality of the feature space. A new Gini index measure function is constructed and adapted to text categorization. Experimental results show that our improved Gini index performs better than other feature selection methods.
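
The abstract does not state the new measure function, so the following is only a minimal sketch of the general idea of Gini-style feature scoring, not the authors' algorithm. It assumes a textbook purity score, the sum over classes of P(class | term) squared, estimated from document frequencies. The function names `gini_purity_scores` and `select_top_k` and the toy corpus are hypothetical, introduced purely for illustration.

```python
from collections import Counter, defaultdict


def gini_purity_scores(docs, labels):
    """Score each term by sum over classes of P(class | term)^2.

    Assumption: this is a generic Gini-style purity measure, not the
    measure function proposed in the paper. A score of 1.0 means the
    term occurs in documents of a single class only.
    """
    # term -> Counter mapping class label -> number of documents containing the term
    doc_freq = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            doc_freq[term][label] += 1

    scores = {}
    for term, per_class in doc_freq.items():
        total = sum(per_class.values())
        scores[term] = sum((n / total) ** 2 for n in per_class.values())
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: tokenized documents and their class labels.
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "league"],
    ["stock", "market", "team"],
    ["market", "stock", "bank"],
]
labels = ["sport", "sport", "finance", "finance"]

scores = gini_purity_scores(docs, labels)
selected = select_top_k(scores, 6)
```

In this toy corpus, "team" appears in both classes and therefore scores below the single-class terms, so a top-6 selection drops it. A raw purity score like this one also gives rare terms a perfect score, which is one reason a practical measure would likely need additional weighting.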

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 33, Issue 1, July 2007, Pages 1–5