Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4375137 | 1303244 | 2011 | 7-page PDF | Free download |
Recent advances in computing technology have increased interest in applying data mining to ecology, and machine learning is among the methods used in most of these applications. As is well known, approximately 80% of the resources in most data mining applications are devoted to cleaning and preprocessing the data. However, few studies address the preprocessing of the ecological data used as input to these data mining systems. In this study, we use four feature selection methods (χ², Information Gain, Gain Ratio, and Symmetrical Uncertainty) and evaluate their effectiveness in preprocessing the input data used to induce artificial neural networks (ANNs) and decision trees (DTs). The presence/absence of fish is the data item used to illustrate our models. Feature selection is fundamental to improving the performance of the resulting models: classification accuracy improves when a small set of optimally selected features is used. DTs and ANNs are very useful tools for modeling the presence/absence of Alburnus alburnus alborella. ANNs generally performed better than DT models.
Journal: Ecological Informatics - Volume 6, Issue 5, September 2011, Pages 309–315
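
The four feature selection measures named in the abstract can be illustrated with a minimal sketch in plain Python. It assumes discrete (already binned) features and a binary presence/absence label. The function names and the toy data are hypothetical and are not taken from the paper, which does not state here how the measures were implemented. Continuous environmental variables would need to be discretized first.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(labels, feature):
    """H(labels | feature) for two equally long discrete sequences."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def chi_squared(labels, feature):
    """Pearson chi-squared statistic of the feature-by-label contingency table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for y, y_count in label_totals.items():
            expected = f_count * y_count / n
            stat += (observed[(f, y)] - expected) ** 2 / expected
    return stat


def feature_scores(labels, feature):
    """Score one discrete feature against the class labels with all four measures."""
    h_label = entropy(labels)
    h_feature = entropy(feature)
    info_gain = h_label - conditional_entropy(labels, feature)
    return {
        "chi2": chi_squared(labels, feature),
        "info_gain": info_gain,
        # Gain Ratio normalizes Information Gain by the feature's own entropy.
        "gain_ratio": info_gain / h_feature if h_feature > 0 else 0.0,
        # Symmetrical Uncertainty normalizes by the sum of both entropies.
        "symmetrical_uncertainty": (
            2 * info_gain / (h_label + h_feature)
            if (h_label + h_feature) > 0
            else 0.0
        ),
    }


# Invented toy data: 1 = fish present, 0 = fish absent.
presence = [1, 1, 1, 1, 0, 0, 0, 0]
depth = ["shallow"] * 4 + ["deep"] * 4      # separates the classes perfectly
substrate = ["gravel", "sand"] * 4          # unrelated to the classes

for name, column in [("depth", depth), ("substrate", substrate)]:
    print(name, feature_scores(presence, column))
```

Ranking features by any one of these scores and keeping only the top few is one simple way to obtain the small feature set the abstract refers to; the selected columns would then be passed to the ANN or DT learner.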