Article code: 405416
Journal code: 677560
Publication year: 2006
English article: 8-page PDF
Full-text version: Free download
English title of the ISI article
A preprocess algorithm of filtering irrelevant information based on the minimum class difference
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

Whether a word (or a feature) should be included or excluded during text classification can depend on a number of factors, such as the amount of information it represents, its frequency of appearance, and its meaning. The application context is another important factor that needs to be considered. A word may represent the characteristics of a document in one application context but fail to reflect its nature in another. This paper reports on an investigation into the selection of features for classification that takes into account the application context of the documents to be processed. A new feature selection algorithm for text classification, called the PBMCD algorithm, is proposed. This algorithm has been implemented and tested using three different data sets. The experimental results show that this algorithm can not only filter out irrelevant features before the classification process but also increase the classification accuracy. For comparison, experimental results with other methods are also presented.

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 19, Issue 6, October 2006, Pages 422–429