Article code: 480129
Journal code: 1446064
Publication year: 2013
English article: 7 pages, PDF
English title of the ISI article
A basic formula for performance gradient estimation of semi-Markov decision processes
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
Abstract

This paper presents a basic formula for performance gradient estimation of semi-Markov decision processes (SMDPs) under the average-reward criterion. The formula follows directly from a sensitivity equation in perturbation analysis. Using this formula, we develop three sample-path-based gradient estimation algorithms that each use a single sample path. These algorithms naturally extend many gradient estimation algorithms for discrete-time Markov systems to continuous-time semi-Markov models. In particular, they require less storage than the existing algorithm in the literature.


► We present a basic formula for gradient estimation of SMDPs.
► The formula directly follows from a sensitivity equation in perturbation analysis.
► We develop three gradient estimation algorithms based on the basic formula.
► These algorithms generalize those for the Markov case.
► These algorithms require less storage than the existing algorithm.
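
For orientation, the kind of sensitivity equation the abstract refers to can be illustrated in the simpler discrete-time Markov setting. The sketch below is not the paper's SMDP formula, which this page does not reproduce. It assumes the standard discrete-time result that, when the reward vector f does not depend on θ, the derivative of the average reward is dη/dθ = π (dP/dθ) g, where π is the stationary distribution and g is a performance potential solving the Poisson equation (I − P) g = f − η e. The three-state chain, its parameterization by `theta`, the reward values, and all function names are hypothetical choices made only for illustration.

```python
# Sketch: average-reward gradient of a small discrete-time Markov chain,
# computed from the sensitivity formula d(eta)/d(theta) = pi * dP * g
# and checked against a central finite difference.
# The chain, rewards, and names below are illustrative assumptions.

REWARD = [1.0, 4.0, 2.0]  # hypothetical per-state rewards f

# Derivative of the transition matrix with respect to theta.
# Only row 0 depends on theta, and every row sums to zero.
D_P = [[-1.0, 1.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]


def transition_matrix(theta):
    """Hypothetical irreducible, aperiodic 3-state chain parameterized by theta."""
    return [[1.0 - theta, theta, 0.0],
            [0.3, 0.2, 0.5],
            [0.6, 0.1, 0.3]]


def mat_vec(P, v):
    """Matrix times column vector."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def vec_mat(v, P):
    """Row vector times matrix."""
    n = len(P)
    return [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(P, iters=2000):
    """Stationary distribution by power iteration from the uniform distribution."""
    pi = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        pi = vec_mat(pi, P)
    return pi


def average_reward(theta):
    """Long-run average reward eta = pi . f."""
    pi = stationary(transition_matrix(theta))
    return sum(p * f for p, f in zip(pi, REWARD))


def potentials(P, f, eta, iters=2000):
    """Iterate g <- f - eta + P g, which converges to a Poisson-equation solution."""
    g = [0.0] * len(P)
    for _ in range(iters):
        Pg = mat_vec(P, g)
        g = [f[i] - eta + Pg[i] for i in range(len(P))]
    return g


def gradient_by_formula(theta):
    """Gradient from the sensitivity formula pi * dP * g."""
    P = transition_matrix(theta)
    pi = stationary(P)
    eta = sum(p * f for p, f in zip(pi, REWARD))
    g = potentials(P, REWARD, eta)
    dP_g = mat_vec(D_P, g)
    return sum(p * x for p, x in zip(pi, dP_g))


def gradient_by_finite_difference(theta, h=1e-5):
    """Central-difference reference value for comparison."""
    return (average_reward(theta + h) - average_reward(theta - h)) / (2.0 * h)


if __name__ == "__main__":
    print(gradient_by_formula(0.4), gradient_by_finite_difference(0.4))
```

The potentials are defined only up to an additive constant, which does not affect the result because each row of dP/dθ sums to zero. A sample-path algorithm would replace the exact π and g with estimates gathered along a single trajectory.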

Publisher
Database: Elsevier - ScienceDirect
Journal: European Journal of Operational Research - Volume 224, Issue 2, 16 January 2013, Pages 333–339
Authors
, ,