Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6409220 | 1629911 | 2016 | 17-page PDF | Free download |

- A novel algorithm to identify alternate subsets of hydro-meteorological predictors.
- Relevance and redundancy of predictors are measured via information theoretic criteria.
- The algorithm is tested on synthetic datasets with known dependence relations.
- A streamflow prediction application gives insights into underlying physical processes.
This work investigates the uncertainty associated with the presence of multiple subsets of predictors yielding data-driven models with the same, or similar, predictive accuracy. To handle this uncertainty effectively, we introduce a novel input variable selection algorithm, called Wrapper for Quasi Equally Informative Subset Selection (W-QEISS), specifically conceived to identify all alternate subsets of predictors in a given dataset. The search process is based on a four-objective optimization problem that minimizes the number of selected predictors, maximizes the predictive accuracy of a data-driven model, and optimizes two information-theoretic metrics of relevance and redundancy, which guarantee that the selected subsets are highly informative and have little intra-subset similarity. The algorithm is first tested on two synthetic test problems and then demonstrated on a real-world streamflow prediction problem in the Yampa River catchment (US). Results show that complex hydro-meteorological datasets are characterized by a large number of alternate subsets of predictors, which provides useful insights into the underlying physical processes. Furthermore, the presence of multiple subsets of predictors (and their associated models) helps find a better trade-off between different measures of predictive accuracy commonly adopted for hydrological modelling problems.
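The four objectives described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: a histogram estimate of mutual information stands in for the information-theoretic relevance and redundancy metrics, a linear least-squares fit stands in for the data-driven model, exhaustive enumeration replaces the multi-objective search, and the tolerance `delta` used to call subsets "quasi equally informative" is a hypothetical parameter. All function and variable names are illustrative.

```python
import itertools
import numpy as np

def mutual_information(a, b, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays.
    Assumption: a simple stand-in for the paper's information-theoretic metrics."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

def evaluate_subset(X, y, subset, bins=8):
    """Return (cardinality, R^2, relevance, redundancy) for one candidate subset."""
    cols = list(subset)
    # Objective 1: number of selected predictors (to be minimized).
    cardinality = len(cols)
    # Objective 2: predictive accuracy (to be maximized); here the R^2 of a
    # linear least-squares fit with intercept, a stand-in for the real model.
    A = np.column_stack([X[:, cols], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()
    # Objective 3: relevance (to be maximized), summed predictor-output MI.
    relevance = sum(mutual_information(X[:, j], y, bins) for j in cols)
    # Objective 4: redundancy (to be minimized), summed pairwise MI in the subset.
    redundancy = sum(
        mutual_information(X[:, i], X[:, j], bins)
        for k, i in enumerate(cols) for j in cols[k + 1:]
    )
    return cardinality, r2, relevance, redundancy

# Synthetic data with a known dependence structure (hypothetical example):
# x1 is a near-copy of x0, x2 is pure noise, and y depends on x0 only.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = 2.0 * x0 + 0.1 * rng.normal(size=n)

# Exhaustive enumeration is feasible for three predictors only; it replaces
# the four-objective search purely for illustration.
scores = {
    s: evaluate_subset(X, y, s)
    for r in range(1, X.shape[1] + 1)
    for s in itertools.combinations(range(X.shape[1]), r)
}

# Keep every subset whose accuracy is within `delta` of the best one.
delta = 0.01
best_r2 = max(v[1] for v in scores.values())
quasi_equal = sorted(s for s, v in scores.items() if v[1] >= best_r2 - delta)

for s in quasi_equal:
    print(s, scores[s])
```

In this toy setting the near-duplicate predictors `x0` and `x1` each yield a single-predictor model of nearly the same accuracy, which is the kind of alternate subset the abstract refers to, while the redundancy term penalizes selecting both together.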
Journal: Journal of Hydrology - Volume 542, November 2016, Pages 18-34