Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
404060 | 677385 | 2008 | 7-page PDF | Free download |
Many learning and heuristic search algorithms require parameter tuning to achieve optimum performance. In stationary and deterministic problem domains, this is usually achieved through off-line sensitivity analysis. However, this method breaks down in non-stationary and non-deterministic environments, where the optimal set of parameter values keeps changing over time. What is needed in such scenarios is a meta-learning (ML) mechanism that can learn the optimal set of parameters on-line while the learning algorithm is trying to learn its target concept. In this paper, we present a simple meta-learning algorithm that learns the temperature parameter of the Softmax reinforcement-learning (RL) algorithm. We present results that show the efficacy of this meta-learning algorithm in two domains.
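For readers unfamiliar with the temperature parameter mentioned in the abstract, the following is a minimal sketch of standard Softmax (Boltzmann) action selection. It is not the paper's meta-learning algorithm, which the abstract does not specify; the function names `softmax_probs` and `select_action` and the example action values are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax_probs(q_values, temperature):
    """Return Boltzmann (softmax) probabilities for a list of action-value estimates.

    A high temperature flattens the distribution (more exploration);
    a low temperature concentrates it on the best-valued action (more exploitation).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum value before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    best = max(q_values)
    exps = [math.exp((q - best) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]  # hypothetical action-value estimates
    for tau in (0.1, 1.0, 10.0):
        print(tau, [round(p, 3) for p in softmax_probs(q, tau)])
```

The sketch illustrates why the temperature matters: the same action values yield nearly greedy behaviour at a low temperature and nearly uniform behaviour at a high one, so a fixed value tuned off-line may stop being appropriate when the environment changes.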
Journal: Knowledge-Based Systems - Volume 21, Issue 8, December 2008, Pages 800–806