Article ID: 494639
Journal: Applied Soft Computing
Published Year: 2016
Pages: 19 Pages
File Type: PDF
Abstract

• For data-driven models, design data should cover the whole data range.
• Convex hull algorithms can be applied as a method for data selection.
• A randomized approximation convex hull algorithm, ApproxHull, is proposed.
• ApproxHull can be used for high dimensions, in an acceptable execution time, and with low memory requirements.
• ApproxHull improves the performance of classification and regression models.

The accuracy of classification and regression tasks based on data-driven models, such as Neural Networks or Support Vector Machines, depends to a large extent on selecting suitable data for designing these models, covering the whole input range in which they will be employed. Convex hull algorithms can be applied as a method for data selection; however, conventional implementations are not feasible in high dimensions because of their high computational complexity. In this paper, we propose a randomized approximation convex hull algorithm, named ApproxHull, which can be used in high dimensions with an acceptable execution time and low memory requirements. Simulation results show that data selection with ApproxHull can improve the performance of classification and regression models in comparison with random data selection.
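
To make the idea of hull-based data selection concrete, the following is a minimal sketch in plain Python. It is not the ApproxHull algorithm from the paper, whose details are not given in this abstract. It swaps in a simple random-direction heuristic: a point that maximizes or minimizes the projection onto some direction lies on the convex hull boundary, so sampling many random directions collects a subset of the hull vertices. The function names `approx_hull_indices` and `select_training_indices`, and the parameters `n_directions` and `seed`, are hypothetical and chosen only for illustration.

```python
import random


def approx_hull_indices(points, n_directions=200, seed=0):
    """Return sorted indices of points that are extreme along random directions.

    Illustrative heuristic only (not the paper's ApproxHull): each returned
    point lies on the convex hull boundary, but hull vertices may be missed.
    Cost is O(n_directions * len(points) * dim).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    indices = range(len(points))
    selected = set()
    for _ in range(n_directions):
        # Gaussian components give a direction drawn uniformly over the sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        projections = [sum(d * x for d, x in zip(direction, p)) for p in points]
        selected.add(max(indices, key=projections.__getitem__))
        selected.add(min(indices, key=projections.__getitem__))
    return sorted(selected)


def select_training_indices(points, n_train, n_directions=200, seed=0):
    """Pick training indices: approximate hull points first, then random fill.

    If more hull points are found than n_train, all of them are returned,
    so the result may be longer than n_train.
    """
    hull = approx_hull_indices(points, n_directions, seed)
    hull_set = set(hull)
    rest = [i for i in range(len(points)) if i not in hull_set]
    random.Random(seed + 1).shuffle(rest)
    n_extra = max(0, n_train - len(hull))
    return hull + rest[:n_extra]


if __name__ == "__main__":
    # Four corners of the unit square plus three interior points.
    data = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.3, 0.7), (0.2, 0.4)]
    print(approx_hull_indices(data))
    print(select_training_indices(data, n_train=6))
```

Putting the boundary points into the design set first is what ensures the model sees the extremes of the input range; the random fill then supplies interior samples. The number of random directions trades execution time against how many hull vertices are recovered.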


Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications