Article code: 380153
Journal code: 1437423
Publication year: 2016
English article: 13 pages, PDF
Full-text version: free download
English title of the ISI article
Online learning for optimistic planning
Persian translation of the title
یادگیری برخط (آنلاین) برای برنامه ریزی خوشبینانه
Keywords
Optimal control, machine learning, Markov decision processes, optimistic planning, near-optimality analysis
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Markov decision processes are a powerful framework for nonlinear, possibly stochastic optimal control. We consider two existing optimistic planning algorithms to solve them, which originate in artificial intelligence. These algorithms have provable near-optimal performance when the actions and possible stochastic next-states are discrete, but they wastefully discard the planning data after each step. We therefore introduce a method to learn online, from this data, the upper bounds that are used to guide the planning process. Five different approximators for the upper bounds are proposed, one of which is specifically adapted to planning, and the other four coming from the standard toolbox of function approximation. Our analysis characterizes the influence of the approximation error on the performance, and reveals that for small errors, learning-based planning performs better. In detailed experimental studies, learning leads to improved performance with all five representations, and a local variant of support vector machines provides a good compromise between performance and computation.

Publisher
Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 55, October 2016, Pages 70–82