Article code: 569569
Journal code: 1452279
Publication year: 2015
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
Improving partial mutual information-based input variable selection by consideration of boundary issues associated with bandwidth estimation
Persian translation of the title
بهبود جزئی انتخاب متغیر ورودی مبتنی بر اطلاعات با در نظر گرفتن مسائل مرزی مربوط به تخمین پهنای باند
Keywords
Artificial neural networks, data-driven models, partial mutual information, kernel density estimation, kernel bandwidth, boundary issues, hydrology and water resources, input variable selection
Related topics
Engineering and basic sciences, computer engineering, software
English abstract


• We address how boundary and bandwidth issues affect the performance of PMI IVS.
• We develop approaches to improve the performance of the PMI IVS for non-Gaussian and non-linear problems.
• Boundary resistant methods exhibit greater success than methods focussed on boundary correction.
• The performance (selection accuracy) of PMI IVS is improved when accounting for boundary issues.
• Preliminary guidelines for bandwidth selection are developed for PMI IVS and successfully validated on two semi-real studies.

Input variable selection (IVS) is vital in the development of data-driven models. Among different IVS methods, partial mutual information (PMI) has shown significant promise, although its performance has been found to deteriorate for non-Gaussian and non-linear data. In this paper, the effectiveness of different approaches to improving PMI performance is investigated, focussing on boundary issues associated with bandwidth estimation. Boundary issues, associated with kernel-based density and residual computations within PMI, arise from the extension of symmetrical kernels beyond the feasible bounds of potential inputs, and result in an underestimation of kernel-based marginal and joint probability distribution functions in the PMI. In total, the effectiveness of 16 different approaches is tested on synthetically generated data and the results are used to develop preliminary guidelines for PMI IVS. By using the proposed guidelines, the correct inputs can be identified in 100% of trials, even if the data are highly non-linear or non-Gaussian.
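
The boundary issue described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it does not reproduce any of the 16 approaches tested in the paper. It assumes a one-dimensional Gaussian kernel, the Gaussian reference rule for the bandwidth, synthetic exponential data bounded below at zero, and reflection about the bound as one textbook boundary correction. All function names are hypothetical.

```python
import math
import random


def gaussian_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(u):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def reference_bandwidth(samples):
    """Gaussian reference rule in one dimension: h = 1.06 * sd * n^(-1/5)."""
    n = len(samples)
    mean = sum(samples) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def kde(x, samples, h):
    """Plain Gaussian kernel density estimate at x (ignores any bound)."""
    return sum(gaussian_pdf((x - s) / h) for s in samples) / (len(samples) * h)


def leaked_mass(samples, h, lower=0.0):
    """Fraction of the total kernel mass that falls below the lower bound."""
    return sum(gaussian_cdf((lower - s) / h) for s in samples) / len(samples)


def kde_reflected(x, samples, h, lower=0.0):
    """Reflection correction (an assumption here, not the paper's method):
    fold the mass that leaked below the bound back into the feasible region."""
    if x < lower:
        return 0.0
    return kde(x, samples, h) + kde(2.0 * lower - x, samples, h)


# Synthetic, non-Gaussian data with a hard lower bound at zero.
rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(200)]
h = reference_bandwidth(samples)

print("bandwidth:", h)
print("kernel mass below the bound:", leaked_mass(samples, h))
print("plain KDE at the bound:", kde(0.0, samples, h))
print("reflected KDE at the bound:", kde_reflected(0.0, samples, h))
```

The leaked mass is the share of probability that the plain estimate places on infeasible values, which is why the density inside the feasible region is underestimated. Reflection restores that mass, but it is only one correction style; the highlights above report that boundary-resistant methods performed better than boundary-correction methods in the paper's tests.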

Publisher
Database: Elsevier - ScienceDirect
Journal: Environmental Modelling & Software - Volume 71, September 2015, Pages 78–96