Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6892738 | 699056 | 2017 | 12-page PDF | Free download |
English title of the ISI article
Optimal decisions for continuous time Markov decision processes over finite planning horizons
Keywords
Continuous time Markov decision processes, finite horizon, uniformization, numerical techniques, optimization
Related topics
Engineering and Basic Sciences
Computer Engineering
Computer Science (General)
English abstract
The computation of ϵ-optimal policies for continuous time Markov decision processes (CTMDPs) over finite time intervals is a sophisticated problem because the optimal policy may change at arbitrary times. Numerical algorithms based on time discretization or uniformization have been proposed for the computation of optimal policies. The uniformization-based algorithm has been shown to be more reliable and often also more efficient, but it is currently only available for processes where the gain or reward does not depend on the decision taken in a state. In this paper, we present two new uniformization-based algorithms for computing ϵ-optimal policies for CTMDPs with decision-dependent rewards over a finite time horizon. Due to a new and tighter upper bound, the newly proposed algorithms can not only be applied to decision-dependent rewards, they also outperform the available approach for rewards that do not depend on the decision. In particular, for models where the policy changes only rarely, optimal policies can be computed much faster.
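To make the setting concrete, the following is a minimal sketch of the simpler time-discretization idea the abstract mentions, not the uniformization-based algorithms of the paper. It approximates the optimal gain of a finite-horizon CTMDP with decision-dependent reward rates by explicit Euler steps backward from the end of the horizon. The function name `finite_horizon_values`, the data layout, and the two-state toy model are all assumptions made purely for illustration.

```python
def finite_horizon_values(rates, rewards, horizon, steps):
    """First-order (explicit Euler) backward induction for a finite-horizon CTMDP.

    rates[s][d]   : dict {successor state: transition rate} under decision d
    rewards[s][d] : reward rate earned in state s while decision d is active
    Returns (gain, policy); policy[k][s] is the decision chosen in state s
    for the step from remaining time k*h to (k+1)*h, with h = horizon / steps.
    The step h must be small relative to the largest exit rate.
    """
    h = horizon / steps
    n = len(rates)
    gain = [0.0] * n  # nothing can be earned with zero time remaining
    policy = []
    for _ in range(steps):
        new_gain, choice = [], []
        for s in range(n):
            best_d, best_v = None, float("-inf")
            for d in rates[s]:
                # Expected change of the gain caused by jumps out of state s.
                drift = sum(q * (gain[succ] - gain[s])
                            for succ, q in rates[s][d].items())
                v = gain[s] + h * (rewards[s][d] + drift)
                if v > best_v:
                    best_d, best_v = d, v
            new_gain.append(best_v)
            choice.append(best_d)
        gain = new_gain
        policy.append(choice)
    return gain, policy


# Hypothetical toy model: in state 0, decision "a" earns reward rate 1 and
# stays put, while decision "b" earns nothing but moves at rate 1 to the
# absorbing state 1, which earns reward rate 3.
rates = [{"a": {}, "b": {1: 1.0}}, {"stay": {}}]
rewards = [{"a": 1.0, "b": 0.0}, {"stay": 3.0}]
gain, policy = finite_horizon_values(rates, rewards, horizon=2.0, steps=2000)

# Index of the first step (counted from the end of the horizon) where "b" wins.
switch_step = next(k for k, c in enumerate(policy) if c[0] == "b")
print("switch at remaining time ~", switch_step * 2.0 / 2000)
```

In this toy model the decision in state 0 depends on the remaining time: close to the end of the horizon the immediate reward of "a" is preferred, whereas with more than roughly 0.5 time units left the move to the better state pays off. This is the kind of time-dependent policy switch that makes the finite-horizon problem hard. A discretization like this one only gives a first-order approximation without the error bounds that the abstract attributes to the uniformization-based approach.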
Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Operations Research - Volume 77, January 2017, Pages 267-278
Authors
Peter Buchholz, Iryna Dohndorf, Dimitri Scheftelowitsch