Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
530917 | 869798 | 2014 | 17-page PDF | Free download |
• An optimal threshold is selected, on the basis of classification accuracy, from a series of empirical threshold formulas developed for Fisher linear discriminants.
• Weight vectors and thresholds are updated by an epoch-limited iterative learning strategy.
• Singular within-class scatter matrices are handled by reducing their dimensionality rather than by adding perturbations.
• A coding system enlarges class margins and approximately preserves neighborhood relationships.
• An integrated learning algorithm improves the learning and generalization performance of Fisher linear discriminants.
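The accuracy-driven threshold selection named in the highlights can be illustrated with a minimal sketch. The paper's empirical threshold formulas are not reproduced on this page, so the candidates in `candidate_thresholds` are hypothetical stand-ins chosen only for illustration, and the function names are assumptions as well. The sketch computes the standard Fisher direction and keeps whichever candidate threshold gives the highest training accuracy.

```python
import numpy as np

def fld_direction(X0, X1):
    """Standard Fisher direction w = Sw^{-1} (m1 - m0) for two classes."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    return np.linalg.solve(Sw, m1 - m0)

def candidate_thresholds(p0, p1):
    """Hypothetical candidates, NOT the paper's empirical formulas."""
    n0, n1 = len(p0), len(p1)
    mu0, mu1 = p0.mean(), p1.mean()
    return [
        (mu0 + mu1) / 2.0,                  # midpoint of projected means
        (n0 * mu0 + n1 * mu1) / (n0 + n1),  # weighted by class sizes
        (n1 * mu0 + n0 * mu1) / (n0 + n1),  # weighted by opposite class sizes
        (p0.max() + p1.min()) / 2.0,        # midpoint between region edges
    ]

def accuracy(p0, p1, t):
    """Training accuracy when class 1 is predicted for projections above t."""
    correct = np.sum(p0 <= t) + np.sum(p1 > t)
    return correct / (len(p0) + len(p1))

def fit_fld(X0, X1):
    """Return the Fisher direction and the most accurate candidate threshold."""
    w = fld_direction(X0, X1)
    p0, p1 = X0 @ w, X1 @ w
    t = max(candidate_thresholds(p0, p1), key=lambda c: accuracy(p0, p1, c))
    return w, t
```

Because the midpoint of the projected means is one of the candidates, the selected threshold is never less accurate on the training set than that conventional choice. Whether a similar guarantee holds for the paper's own formulas depends on which formulas it includes.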
This paper studies Fisher linear discriminants (FLDs) that are tuned for classification accuracy on imbalanced datasets. An optimal threshold is selected from a series of empirical formulas, which depend not only on sample sizes but also on distribution regions. A mixed binary–decimal coding system is suggested to make very dense datasets sparse and to enlarge the class margins, provided that the neighborhood relationships of samples are nearly preserved. Within-class scatter matrices that are singular or nearly singular should be moderately reduced in dimensionality rather than having tiny perturbations added. The weight vectors can be further updated by an epoch-limited (three epochs at most) iterative learning strategy, provided that the current training error rate decreases accordingly. Combining these ideas, the paper proposes a type of integrated FLD. Extensive experiments on real-world datasets show that the integrated FLDs outperform conventional FLDs in both learning and generalization performance on imbalanced datasets.
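One way to reduce the dimensionality of a singular within-class scatter matrix, instead of perturbing it, is sketched below. The abstract does not say how the reduction is carried out, so this is an assumption: the sketch discards eigen-directions of the scatter matrix whose eigenvalues are numerically zero and solves for the Fisher direction in the remaining subspace. The function name and the `rel_tol` parameter are hypothetical.

```python
import numpy as np

def reduced_fld_direction(X0, X1, rel_tol=1e-10):
    """Fisher direction computed in the non-degenerate subspace of Sw.

    Assumption: directions with near-zero eigenvalues are dropped. The
    paper's actual reduction procedure is not given in the abstract.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Sw is symmetric, so eigh returns real eigenvalues in ascending order.
    evals, evecs = np.linalg.eigh(Sw)
    keep = evals > rel_tol * evals.max()
    P = evecs[:, keep]                # d x k basis of the retained subspace
    Sw_red = P.T @ Sw @ P             # k x k and non-singular
    w_red = np.linalg.solve(Sw_red, P.T @ (m1 - m0))
    # Map the reduced direction back to the original feature space.
    return P @ w_red, int(keep.sum())
```

For example, if one feature is an exact copy of another, the scatter matrix loses one rank, and the function returns a direction built from the remaining dimensions without adding any ridge term.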
Journal: Pattern Recognition - Volume 47, Issue 2, February 2014, Pages 789–805