Article ID: 570187
Journal: Environmental Modelling & Software
Published Year: 2015
Pages: 8
File Type: PDF
Abstract

• Comparative evaluation of two model-free input selection methods and one model-based method.
• Four synthetic and two real datasets are used for the comparative evaluation.
• Model-free techniques: partial linear correlation (PLC) and partial mutual information (PMI).
• Model-based technique based on genetic programming (GP).
• Inputs selected by both PLC and GP are recommended as the significant inputs.

Appropriate selection of inputs for time series forecasting models is important because it can improve forecasting performance and also helps reduce the cost of data collection. This paper investigates the selection performance of three input selection techniques: two model-free techniques, partial linear correlation (PLC) and partial mutual information (PMI), and a model-based technique based on genetic programming (GP). Four hypothetical datasets and two real datasets were used to demonstrate the performance of the three techniques. Based on the results, two techniques are recommended for the input selection task: the model-free PLC technique, because of its computational simplicity, and the model-based GP technique, because of its ability to detect non-linear relationships (demonstrated by its relatively good performance on a hypothetical complex non-linear dataset). Candidate inputs selected by both recommended techniques should be considered significant inputs.
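
To make the idea of partial linear correlation concrete, the following is a minimal sketch of a forward input selection loop driven by partial correlation. It is not the procedure used in the paper: the function names `partial_corr` and `plc_select`, the stopping threshold of 0.2, and the greedy forward scheme are all assumptions chosen for illustration.

```python
import numpy as np

def partial_corr(x, y, Z):
    """Linear correlation between x and y after removing the linear
    effect of the columns of Z (plus an intercept) from both."""
    n = len(y)
    A = np.column_stack([np.ones(n), Z])             # intercept + conditioning inputs
    rx = x - A @ np.linalg.lstsq(A, x, rcond=None)[0]  # residual of x given Z
    ry = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]  # residual of y given Z
    denom = np.sqrt((rx @ rx) * (ry @ ry))
    return float(rx @ ry / denom) if denom > 1e-12 else 0.0

def plc_select(X, y, threshold=0.2):
    """Greedy forward selection (hypothetical helper): repeatedly add the
    candidate with the largest absolute partial correlation with y, given
    the inputs already selected. The threshold is an illustrative assumption."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        Z = X[:, selected]                           # shape (n, 0) on the first pass
        scores = {j: abs(partial_corr(X[:, j], y, Z)) for j in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break                                    # no remaining candidate adds enough
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each candidate is scored only on its linear association with the target, a sketch like this is cheap to run but cannot detect purely non-linear dependence, which is consistent with the abstract's motivation for also considering PMI and GP.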

Related Topics
Physical Sciences and Engineering Computer Science Software