Article code: 406628
Journal code: 678101
Publication year: 2014
English article: 10 pages, PDF
English title of the ISI article
A methodology for training set instance selection using mutual information in time series prediction
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract

Training set instance selection is an important preprocessing step in many machine learning problems, including time series prediction, and should be considered in practice to improve prediction quality and possibly reduce training time. Recently, the use of mutual information (MI) has been proposed in regression tasks, mostly for feature selection and for separating the real data from noise and outliers in contaminated data sets. This paper proposes a new methodology for training set instance selection in long-term time series prediction. The proposed methodology combines a recursive prediction strategy with an advanced instance selection criterion: the nearest-neighbor-based MI estimator. MI is applied to the selection of training instances by computing, at every prediction step, the MI between the initial training set instances and the current forecasting instance. The novelty of the approach lies in the fact that it fits the initial training subset to the current forecasting instance, and consequently reduces the uncertainty of the prediction. By selecting, at every prediction step, the instances that share a large amount of MI with the current forecasting instance, the method can reduce error propagation and accumulation, two well-known shortcomings of the recursive prediction strategy, and thus improve forecasting quality. The approach also differs from others in that it is not proposed as an outlier detector, but as an instance selection method for data that do not necessarily contain noise or outliers. Results on the data sets from the NN5 time series prediction competition indicate that the proposed method improves the quality of long-term time series prediction and reduces the number of instances needed to build the model.
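
The per-step selection loop described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the paper: the function names are hypothetical, the MI estimator is a Kraskov-style k-nearest-neighbor estimator, the MI between a training instance and the forecasting instance is computed by treating matching lag positions as paired samples, and the predictor (the mean of the selected targets) is only a placeholder for whatever regression model is actually trained on the selected subset.

```python
import math

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma at a positive integer n: -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def knn_mutual_information(x, y, k=3):
    """Kraskov-style k-NN estimate of MI between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n <= k:
        raise ValueError("need equal-length sequences with more than k points")
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(max(abs(x[i] - x[j]), abs(y[i] - y[j]))
                       for j in range(n) if j != i)
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count points strictly closer than eps in each marginal space.
        nx = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    mi = digamma_int(k) + digamma_int(n) - total / n
    return max(mi, 0.0)  # clip small negative estimates to zero

def make_instances(series, window):
    """Turn a series into (lag window, next value) training instances."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

def select_instances(instances, query, keep, k=3):
    """Keep the `keep` instances whose windows share the most MI with the query."""
    ranked = sorted(instances,
                    key=lambda inst: knn_mutual_information(inst[0], query, k),
                    reverse=True)
    return ranked[:keep]

def recursive_forecast(series, window, horizon, keep, k=3):
    """Recursive multi-step forecast with MI-based instance selection per step."""
    instances = make_instances(series, window)  # initial training set
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        query = history[-window:]  # current forecasting instance
        subset = select_instances(instances, query, keep, k)
        # Placeholder model (assumption): average the targets of the selected subset.
        prediction = sum(target for _, target in subset) / len(subset)
        forecasts.append(prediction)
        history.append(prediction)  # the prediction feeds the next step's query
    return forecasts

# Toy usage on a synthetic weekly pattern with a slight trend.
series = [math.sin(2 * math.pi * t / 7) + 0.01 * t for t in range(70)]
forecasts = recursive_forecast(series, window=14, horizon=5, keep=10)
```

Because the query window is rebuilt from the forecast history at every step, the selected subset changes as the horizon advances, which is the mechanism the abstract credits with limiting error accumulation. In a fuller implementation the placeholder average would be replaced by a model retrained on each selected subset.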

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 141, 2 October 2014, Pages 236–245