Article code | Journal code | Publication year | English article | Full-text version
4946814 | 1439556 | 2018 | 9-page PDF | Free download
English title of the ISI article
Evaluation of a novel GA-based methodology for model structure selection: The GA-PARSIMONY
Persian translation of the title
ارزیابی یک روش جدید مبتنی بر GA برای انتخاب ساختار مدل: GA-PARSIMONY
Keywords
Genetic algorithm; Parameter tuning; Feature selection; Parsimony criterion; Model comparison
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


- Methodology that combines error minimization with a parsimony criterion.
- GA-based method built on two consecutive ranks of the individuals.
- Complexity rank based on the Wilcoxon signed-rank test.
- Reduced number of features in the models with the best generalization errors.

Most proposed metaheuristics for feature selection and model parameter optimization are based on a two-term Loss+Penalty function. Their main drawback is the need to manually set the parameter that balances the loss and the penalty terms. In this paper, a novel methodology referred to as GA-PARSIMONY, specifically designed to overcome this issue, is evaluated in detail on thirteen public databases with five regression techniques. It is a GA-based metaheuristic that splits the classic two-term minimization function by making two consecutive ranks of individuals. The first rank is based solely on the generalization error, while the second (named ReRank) is based on the complexity of the models, giving special weight to the complexity entailed by a large number of inputs. For each database, the models with the lowest testing RMSE and without statistical difference among them were referred to as winner models. Within this group, fewer than 50% of the available features were selected, which proves an optimal balance between error minimization and parsimony. In particular, the most complex algorithms (MLP and SVR) were mostly selected in the group of winner models, while using around 40-45% of the available attributes. The most basic IBk, ridge regression (LIN) and M5P were only classified as winner models in the simpler databases, but used fewer features in those cases (up to 20-25% of the initial inputs).
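
The two consecutive ranks described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Individual`, `similar_error`, `two_step_rank` and the `tol` parameter are hypothetical. Complexity is assumed here to be a single number such as the count of selected features. A plain error tolerance stands in for the statistical comparison, whereas the highlights mention the Wilcoxon signed-rank test. A paired test on per-fold errors, for example SciPy's `scipy.stats.wilcoxon`, could be supplied through the `tied` argument if each individual stored its per-fold errors.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Individual:
    name: str          # hypothetical label for a candidate model
    error: float       # validation error such as RMSE (lower is better)
    complexity: float  # assumed measure, e.g. number of selected features


def similar_error(a: Individual, b: Individual, tol: float = 0.01) -> bool:
    """Stand-in for a statistical test: errors within `tol` count as tied."""
    return abs(a.error - b.error) <= tol


def two_step_rank(
    population: List[Individual],
    tied: Callable[[Individual, Individual], bool] = similar_error,
) -> List[Individual]:
    # First rank: order the individuals by error only.
    ranked = sorted(population, key=lambda ind: ind.error)
    # Second rank (re-rank): move a simpler model above its neighbour
    # whenever the two errors are considered indistinguishable.
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(ranked) - 1):
            upper, lower = ranked[i], ranked[i + 1]
            if lower.complexity < upper.complexity and tied(upper, lower):
                ranked[i], ranked[i + 1] = lower, upper
                swapped = True
    return ranked


# Illustrative data only; these numbers are not taken from the paper.
demo = [
    Individual("A", error=0.100, complexity=12),
    Individual("B", error=0.105, complexity=5),
    Individual("C", error=0.200, complexity=2),
    Individual("D", error=0.108, complexity=8),
]
ranking = [ind.name for ind in two_step_rank(demo)]
```

In this sketch no weighting parameter between error and complexity has to be chosen: complexity only reorders individuals whose errors are treated as tied, so model "C" stays last despite being the simplest because its error is clearly worse.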

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 271, 3 January 2018, Pages 9-17
Authors
, , , , ,