Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
480129 | 1446064 | 2013 | 7-page PDF | Free download |
A basic formula for performance gradient estimation of semi-Markov decision processes
This paper presents a basic formula for performance gradient estimation of semi-Markov decision processes (SMDPs) under the average-reward criterion. The formula follows directly from a sensitivity equation in perturbation analysis. Using it, we develop three sample-path-based gradient estimation algorithms, each requiring only a single sample path. These algorithms naturally extend many gradient estimation algorithms for discrete-time Markov systems to continuous-time semi-Markov models. In particular, they require less storage than the existing algorithm in the literature.
► We present a basic formula for gradient estimation of SMDPs.
► The formula directly follows from a sensitivity equation in perturbation analysis.
► We develop three gradient estimation algorithms based on the basic formula.
► These algorithms generalize their counterparts for the Markov case.
► These algorithms require less storage than the existing algorithm.
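The abstract does not reproduce the SMDP formula itself, so the sketch below illustrates only the discrete-time Markov chain case that the paper says it generalizes. It uses the standard perturbation-analysis sensitivity equation dη/dθ = π[(dP/dθ)g + df/dθ], where π is the stationary distribution, η = πf is the average reward, and g is the vector of performance potentials. All function names, the two-state chain, and the iteration counts are assumptions made for illustration; this is not one of the paper's three algorithms, and it computes the gradient exactly from a known model rather than estimating it from a sample path.

```python
# Illustrative sketch (assumption): average-reward gradient of a small
# discrete-time ergodic Markov chain via dη/dθ = π[(dP/dθ) g + df/dθ].

def mat_vec(P, v):
    """Return the matrix-vector product P v."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def vec_mat(v, P):
    """Return the row-vector-matrix product v P."""
    n = len(P)
    return [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, iters=200):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = vec_mat(pi, P)
    return pi

def average_reward(P, f):
    """Long-run average reward η = π f."""
    pi = stationary(P)
    return sum(p * x for p, x in zip(pi, f))

def potentials(P, f, eta, iters=200):
    """Performance potentials g = sum_k P^k (f - η e), truncated at iters terms."""
    term = [x - eta for x in f]
    g = [0.0] * len(f)
    for _ in range(iters):
        g = [a + b for a, b in zip(g, term)]
        term = mat_vec(P, term)
    return g

def performance_gradient(P, dP, f, df):
    """Evaluate π[(dP/dθ) g + df/dθ] for given P, dP/dθ, f, and df/dθ."""
    pi = stationary(P)
    eta = sum(p * x for p, x in zip(pi, f))
    g = potentials(P, f, eta)
    dPg = mat_vec(dP, g)
    return sum(pi[i] * (dPg[i] + df[i]) for i in range(len(f)))

def chain(theta):
    """Hypothetical two-state chain whose first row depends on θ."""
    return [[1.0 - theta, theta], [0.5, 0.5]]

if __name__ == "__main__":
    theta = 0.3
    f = [1.0, 3.0]                      # rewards, independent of θ here
    dP = [[-1.0, 1.0], [0.0, 0.0]]      # entrywise derivative of chain(θ)
    df = [0.0, 0.0]
    print(performance_gradient(chain(theta), dP, f, df))
```

For this hypothetical chain the average reward has the closed form η(θ) = (0.5 + 3θ)/(θ + 0.5), so the derivative is 1/(θ + 0.5)², which gives a direct check on the computed value. A sample-path-based estimator would replace the exact π and g with quantities accumulated along a simulated trajectory.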
Journal: European Journal of Operational Research - Volume 224, Issue 2, 16 January 2013, Pages 333–339